Posted: . At: 12:28 PM. This was 5 years ago. Post ID: 13354
Page permalink. WordPress uses cookies, or tiny pieces of information stored on your computer, to verify who you are. There are cookies for logged in users and for commenters.
These cookies expire two weeks after they are set.


Discussion of virtualization and database servers.


I have been researching a system for deploying a database cluster. I have chosen the Microsoft SQL Server product. This does support server clustering and uses nodes clustered together on a network to create a failover network that will automatically maintain operation if one node is to fail. A virtual network of database nodes spread across a network would be sufficient to support a data warehouse. The SQL installation will detect that the clustered servers are on the network and prompt you to add the server as a node. The server itself will be virtualized on the virtual instance and the data itself will be stored on a shared SAN-NAS disk that will be used by all virtual SQL server nodes.

All servers then are virtualized and if one goes down; the data is not lost. The new instance may be installed and the new instance is connected to the data store to re-synchronize with the database. I have experimented with SQL server 2012 and Windows Server 2012. The clustering will function perfectly when the servers are joined together as one whole network. The server requires the aforementioned drive that will be shared storage for all the SQL server instances. A shared folder mapped to a drive letter does not seem to work at all.

My data warehousing solution will use Microsoft SQL server 2017 Enterprise. This product supports data warehousing as a feature. This uses a database to handle massive amounts of data that is generated by a business. This is intended to store data that will not change once entered, this is historical data that is intended for future reference. A data warehouse is typically not in third normal form as a normal database will be. One main feature of a data warehouse is that is to support ad-hoc queries. This means that the database must be constructed to accommodate this.

Optimising the query performance is very important; usually a star schema is used to achieve this. Spreading the load across a cluster of SQL servers would be an important way to achieve a fast and useful data warehouse. One important operation in an SQL server warehouse is to clean and process data before entering it into the database. This is typically done via a staging area, this processes data and simplifies the process of building summaries before the data ends up in the data warehouse. Meta-data is also added. Making the task of data mining very easy when a researcher wants to connect to the database to find out sales information.


Leave a Comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.