MSBI Interview Questions and Answers For Freshers and Experienced
- By Gurunatha Dogi in MSBI
- May 7th, 2016
- 22751
- 2
In this article we will see complete msbi interview questions and answers, If you are looking for job in MSBI then well-prepare this question and answer series we will keep updating this page with more questions and answers. This Q&A not only help you to get a job but also will help you to learn MSBI. It is also helpful for both For Experienced and as well for Freshers
This article written by "Ahtesham Shaikh", Ahtesham Shaikh is MSBI (Microsoft Business Intelligence) expert, He not only writes for MSBI but also take trainings on MSBI for experienced and as well for freshers.
If you want to watch MSBI step by step videos or view tutorials kindly visit this important link - MSBI Website
What do you mean by BI?
In simple words BI stands for Business Intelligence. It is a process of converting data in to useful information required for doing business, taking appropriate decision to make it move in positive direction which will fetch profits.
Considering a small example below is a sample customer data. Now when someone looks at the below data it probably does not make sense. But the time we analyze it and say "Majority of Customers buy toys" it starts making lot of sense.
That small sentence above is nothing but information. That information will help your forecast and analyze how you would like to move ahead.
Customer Name | Product bought |
---|---|
Shiv | Toys |
Shaam | Toys |
Raju | Shirts |
Ajay | Toys |
Khadak | Pants |
What is MSBI
Business Intelligence is a concept in general term. To implement this concept various organization like Oracle, Microsoft have come up with their various tools. For Oracle It is Oracle Business Intelligence tool. Informatica is also one of the tool for Business Intelligence. So coming to our question that word MSBI stands for Microsoft Business Intelligence.
Various component of MSBI
Now as we have understood about MSBI so now let us try to look into the details of MSBI. It is a composite of three separate entities. Each component follows one after the other.
SSIS (SQL Server Integration Services): Under SSIS ETL process takes place where raw data is extracted from various source of data, transformed it into desired form as required by user and then finally it is loaded in the database.
SSAS (SQL Server Analysis Services): Under SSAS loaded data is now analyzed as per business point of view and using CUBE essential analysis is taken place using SSAS.
And SSRS (SQL Server Reporting Services): Finally under SSRS once the SSIS and SSAS is completed with the help of reporting services reports are generated. Using this reports decision making becomes easier.
Explain in brief working of SSIS component in MSBI?
SSIS comprises of ETL process where raw data or raw information is taken from CSV form, excel or Microsoft access and stored in Dataware house. SSIS takes care of raw data (data source) and stores it in Dataware house which is not very easy it has to go through ETL (Extraction, Transformation and Loading) process.
While going through the journey of process called ETL: -
- Extraction from raw data is done it first identifies whether raw data is CSV file, binary, SQL file or Microsoft Access. Once the identification of data source is done then it's Transformation is done.
- In the Transformation process various data type used in raw data are brought into common data form in order to avoid confusion.
- Once the Extraction and Transformation process is done then loading of that data source is done into DataWare house.
What is the life cycle of BI?
As we said previously BI converting data in to information. Converting data in to information is a complex process. It involves lot of steps let us understand the same.
Below are the important steps in BI life cycle: -
- It takes raw data then it will do extraction in order to know which type of raw data it is whether CSV, binary format, SQL file and Microsoft Access file.
- Once type of the data is identified then it is transformed into common data form through extraction process.
- After successful transformation then loading of the data into dataware house.
- Data after reaching dataware house data's analysis is done and then stored in data structure called CUBE.
- Finally massaged stored data is ready for displaying to the user as useful information through reporting.
What are the various Components of MSBI in detail?
There are 3 major components of MSBI: -
SSiS(SQL Server Integration Service): It is a process from where the raw data or raw information is taken from CSV form, excel or Microsoft access and stored in Dataware house. SSIS takes care of raw data(data source) and stores it in Dataware house which is not very easy it has to go through ETL(Extraction, Transformation and Loading).
In the process of ETL: -
- Extraction from raw data is done it first identifies whether raw data is CSV file, binary, SQL file or Microsoft access. Once the identification of data source is done then it is transformation is done.
- In the transformation process various data type used in raw data are brought into common data form in order to avoid confusion.
- Once the extraction and transformation process is done then loading of that data source is done into dataware house.
So while creating project for SSIS use Integration Services project on Visual Studio as shown in the image below: -
SSAS (SQL Server Analysis Service) once data file reaches dataware house after going through ETL process, this analysis are stored in data structure called CUBE. It provides deeper and faster data analysis. CUBES are multi-dimensional data sources which have dimensions and facts (measures) as its basic constituents.
Creating CUBE for SSAS use Analysis Services project on Visual Studio as shown in the image below: -
SSRS (SQL Server Reporting Service) Reporting Services is a tool that will read the data from CUBE and stored data is displayed as useful information to the user.
With Report Server Project Wizard on Visual Studio reports are displayed to show useful information data as shown in the image below :
What is the importance of Data warehouse database?
Data warehouse is the location where raw data is stored after going through the process of E(Extraction), T(Transformation) and L(Loading), The prime importance of data warehouse is that it stores various data sources(CSV form, SQL files, oracle files, excel or Microsoft access) which serves organization to do analysis in order to achieve optimum benefits.
It also provides following more benefits: -
Data warehouse delivers extended Business Intelligence: As it uses various data source so it is never limited to data and because of that decision making is fast which goes in favour of the organization.
Data warehouse saves times: There is a Chinese proverb which says that Time is equal money so Data warehouse saves both time and money as it has number of data from different resources which helps to make decision in no time.
What is the difference between SQL Server and SQL Server Data Tools (SSDT)?
SQL Server is full-fledged database with actual RDBMS where we store and retrieve data on the same system or different system through network, create tables and write stored procedure etc whereas SQL Server Data Tools (SSDT) is a tool to create complete Business Intelligence with Microsoft BI tools i.e. SSIS, SSAS and SSRS. After you do the installation of SQL Server and SQL Server Data Tools (SSDT) you can open from your Start Menu of system.
SSDT also offers building light database within Visual Studio through Object Explorer and also creation and editing of database objects and data or can execute queries.
What is the difference between data flow and control flow?
Data flow is made of ETL whereas Control flow is made up of tasks connected with precedence constraints.
In Dataflow none of the component will wait for other component to work whereas Control flow new task won't proceed unless previous task is finished.
SSIS Toolbox properties are also different for both Dataflow and Control flow.
Control flow has data flow code and also Control flow invokes the data flow following is the image which display's the same.
What is the concept of connection manager in a source component?
Connection manager serves as connectivity in order to do configuration between actual source raw data file and extraction component of business intelligence (BI). Following figure depicts how Flat File Source is actually getting connected to the CSV file through Connection Manager.
Where will you find ETL in Visual Studio?
In order to see ETL within your Visual Studio you have to go to project and by opening Package.dtsx file under that click on Data flow. Once you are on Data flow then open SSIS ToolBox as shown in the following image :-
How is the SSIS project structured?
In order to see practically SSIS project structure you have to open Solution Explorer in Visual Studio get to see more in the following image. Under the Solution you will find SSIS project followed by SSIS packages and then Package.dtsx. So all the data flow, control flow, Parameters, Event Handler and Package Explorer go inside Package.dtsx file.
What is the Package file of SSIS project made up of?
SSIS project file has extension of .dtsx(data transformation services) internally dtsx file is made up of XML file. Following is the snapshot of the internal XML file of Package.dtsx.
What is the need of Data viewer?
It is a feature offered within SSIS which is need to enabled by doing right click on the pipeline and click "Enable Data Viewer" as shown in the below image :-
The main purpose of Date viewer tool is to debug SSIS program. It will always help you to know the inner details between two components. Once you see the symbol on the path as shown in the below image then it means that Data Viewer is activated between them that while will help you to do debugging. It will halt the execution of program and open a small window/grid which carries data and you will come to know exact status of the data like what rows are being filled and columns are getting displayed and what is the value getting filled in rows and columns etc.
What to do when deserializing of the package does not happen in SSIS project?
The main reason behind not able to do deserializing of the package in SSIS project which display's following error message as shown below. This is because of the 32 bit and 64 bit run time issues
So in order to resolve it go to the project properties by doing right click on it and click on Properties and
turn 64 bit to FALSE as shown in the below image so that project can run smoothly on 32 bit runtime and we should get rid of the error which we were getting.
How can we figure out error's details in SSIS project?
Following is the pictorial representation of SSIS project where red crossed mark means an error has been occurred. This red mark only states that error has been occurred but there are no details so that we can rectify that error to make our project work.
So in order to get the error details under Package.dtsx file click on the "Progress" tab and then go to the red mark and just next to it written description for the cause of that error following image shows the red cross mark error with its details. This Progress tab actually tells what exactly is happening behind when the Package.dtsx file is executing. If the error you are not able to read it then just do right click and copy-paste it on notepad to understand the error clearly.
How to apply IF condition in SSIS?
Conditions are applied in SSIS using Conditional Split component as shown in the following image in order to validate data is proper or to restrict unwanted data to get entered in the main table.
Above is the scenario from where we want only proper amount should enter in the Customer data ware house and unwanted or junk data should go in Flat File Destination. Just by doing right click on Conditional Split set the operators or simply drag and drop the condition which states where amount equal or less than 0 should not enter our Customer data ware house and instead it should go in error file named Flat File Destination.
When you see RED color in the IF condition expression box what does that mean?
Even when your expression is correct you will see RED colored on expression if the data type is not correct as you see in the following image snapshot: -
Data type should be numeric so that we can do comparison if data type is string or other than numeric it will show you expression in red color as shown in the above image.
For example if your data coming is from CSV file it will be purely in simple text form which is in string form and such data from CSV format is treated as string and for comparison purpose we need numeric fields. So in order to apply comparison we will first convert CSV string data field into numeric field.
Where can you see the data types for source and destination?
Data types of source and destination can be seen under Connection Manager as shown in the below image. Just do double click on source in this example we have CSV connection and "CustomerDatawareHouse" as destination. By clicking on each you will be able to see its data types.
We have decided to view data type of source element so we have double clicked on CSV connection which is source
WFor currency data type which SSIS data type matches?
In SSIS for the currency we have data type currency[DT_CY] you can get the same just by doing right click in the component and click on the Edit present in it which will in turn open Data Conversion Transformation Editor as you can see the same in the following image screen
By default CSV files have which data types?
In order to see default datatype for CSV file practically just do a right click on the CSV connection of Connection Manager its editor where you will be able to find datatype. Data coming from CSV file is simple text format which is string data type string [DT_STR] following is the snapshot CSV file component.
You want to convert string to a INT type what will you do?
In such scenarios where we have to do the conversion of string to INT, MONEY other any other data types we have to pull Data conversion component from the SSIS Toolbox as shown in the following image: -
Once we have Data conversion component on Data Flow then just do right click and click Edit which in turn open Data transformation Editor then from the Data Type select four-byte signed integer in SSIS Data transformation Editor to map for INT data type.
What is the default behavior of SSIS application when error happens?
Whenever an error occurs in Data Flow task of SSIS application Fail Component is the default option set for both Error and Truncation following is the figure which depicts the same. As default behavior is Fail Component so in the time of error it will throw an exception and it will stop processing Data Flow Task. This Fail Component will fail the entire step and then it will direct the package down and stop it through the red arrow path.
What is the difference between Fail Component, Redirect Rows and Ignore Failure?
Whenever an error is occurred in SSIS application during execution there are three handling options for the same: -
Fail Component: - This is the default option set for all components within SSIS and whenever error is occurred when Fail Component is set Data Flow will stop down immediately.
Redirect Rows: - This option when set does not stop Data Flow and its output can be redirected and can be given as an input to other component to handle it separately.
Ignore Failure: - When this option is set then as the name suggest error is ignored and data row is directed to the output.
What is the difference between for loop and foreach loop container?
A For Loop container loops till a constant value. So for example if you want to loop until a value is 12 or less than 12 so for upto fixed count you have to use For Loop container Whereas Foreach Loop container loops through collection so this collection can files in a folder or collection can be in record set.
What are variables and parameters in SSIS?
Variable are temporary storage where we can store value they are more internal to SSIS whenever you want to pass data from control flow to data flow whereas Parameter is helpful in passing values on the runtime and they are external to pass data. Parameters are used during the production.
What is the difference between variables and parameters?
Variable stores values so that SSIS package containing component, tasks can use at run time. | SSIS parameters allow to assign values to properties within package at the time of package execution. |
They are more internal to SSIS application to pass data. | They are more internal to SSIS application to pass data. |
They are just written with simple name as given. | They are denoted with $ symbol by the name. |
They help to pass data from control flow to data flow. | They are passed from outside and help during deployment in production server. |
What is the importance of expressions in SSIS?
The very importance of expressions in SSIS is that it is dynamic and helps you set values of the properties of SSIS component from variables and parameters.
In order to make ConnectionString as dynamic set variables to ConnectionString by using Expressions.
Once you are properties Step 1 - expand Expressions click to open Property Expression Editor and set Property Expression.
Step 2 - click and open Expression Builder.
Step 3 - Once you open Expression Builder specify the expression for this property and then click OK as shown in the following image.
How is debugging done in SSIS and how to watch the variable values in SSIS?
Debugging is the process where you would come to know what is happening inside how the values are passed within each routine of the flow. So in SSIS debugging happens at component level.We will take following example to do debugging and let's see how the whole process is carried out. Here we have two container "Foreach Loop Container" and "Load CSV (Single File)". Now in order to debug it just do right click on container and click on Edit Breakpoints.
Once you click on Edit Breakpoints you will see Break Condition and according to condition you can put the breakpoints as shown in the following figure
Now run it in debug mode by clicking it onto Debug Start Debugging or press F5 button on keyboard
After starting it in debug mode you will see brown circle with yellow arrow over where currently debug breakpoint is running. Now in order to move breakpoint further you have to have to click on Continue as shown in the following image
Now in order to watch the variables value you have to quick watch or do an add watch to as and when breakpoint moves it will also let you know values of variables with the help of quick/add watch.
What is the difference between quick watch and add watch?
Quick watch is the feature offered during the debug mode which is enabled by clicking the Debug and then click QuickWatch in order to view values of variable during each breakpoint. Whereas when clicked on Add Watch in QuickWatch window it will add watch window just next to the running program. With Add Watch enabled you will be able to see value of variable loading in Watch window during each breakpoint.
Can we see the watch windows when we are not debugging?
As watch window is feature of debugging when we have put breakpoints on the container, so it will be always available when the program is in the debugging mode. Following is the image which depicts the same of QuickWatch enabled which will show watch windows.
and not when it is developer mode or simply running program. Even when breakpoints are put on the container in non-debugging mode you will not see QuickWatch under the Debug option as you see in the following image.
How to do deployment of a running project in SSIS?
In order to do deployment in SSIS setup file is created so for doing deployment go to project development folder as shown in the following image do double click and open Deployment wizard
Click Next and select Source Project Deployment File and give path of the SSIS project i.e. ispac extension.
Now create a destination for deploying the project in SQL Server Management studio under SSISDB with a name given as MyPackage
Once folder is the created, now on Wizard screen under "Select Destination" give server name, here "KHADAK-PC" and then browse for folder path as destination to deploy the project.
Once you have selected source and destination do a final review in the next wizard screen as shown in the following image so that before you proceed with deployment final step.Review is necessary in order to avoid issues once you find everything is correct then click on the Deploy in the wizard screen.
Finally you will see Results in steps and if everything goes fine you will see Passed in the Result which means deployment is successfully done.
Once it is deployed go to SQL Server, do right click on "MyPackage" and refresh the package you will then see following image hierarchy and on expanding you will see all listed projects, packages and its files.
How to configure the package?
The answer to this question is continuation to above question and purposely we have made it as separate question so learning can be made easy. After you create the package in order to execute or run it, we first need to configure it. And for that do right click on the package as shown in the following image snapshot.
Once configure UI window is opened, then you can configure all your connection manager location for both source and destination just click on source or destination and on right side under properties you will see its Property name and it value i.e. location on your local hard drive (for now you will see default values set as and when new package is created) and here changes/altered can be done according to requirement.
If you wish to change the location of Destination package file then create a folder in your local system first or select the folder in which you want to create destination file. And to select folder under properties click and set parameter value as shown in the below image:-
Once you see following screen is open with which you can set new parameter value by clicking radio button "Edit" and fill the field with new location, click OK. So with using configure you can go and change configuration manager default value.
Can we make changes to connection managers during execute or run?
Yes we can make changes to parameters values during execute or run package.
But the parameter values will not be saved during changes made during execute or run package rather its changes made to connection manger properties during execute or run it will go to production server.
How to view report of executed package?
Report viewing option is given in SSDT when you execute or package is run successfully. You will get to see following screen when execution of package is completed. If the package contains errors those will be respectively displayed over here.
Once Yes is clicked then you will be able to see following details of complete overview report where you will find information and overview of execution.
Now we have seen report generated we will go to new destination location and check whether new file status. Once you are on the location which we earlier chosen on local hard drive you will find file created called "Destination" as you see in the following image snapshot.
How to run SSIS package as a task?
In order to make life easier of a developer how about automating the thing and running it as task. Yes you heard it correct that we can run package as task which keep on running continuously may be every day, weekly or monthly basis and keeps copying data from source to destination location.
Now to make it as background running task which is going to be carried out by SQL Server agent under object explorer -> SQL Server and now to enable it do right click on SQL Server Agent and click on Start as shown in the below image :
Now SQL Server Agent will take care of running task automatically so that no need to it manually. Once your SQL Server Agent starts now go and create new Jobs under it
Once New Job window is opened click on General available on the left side and give suitable name to New Job as here we have given it to "FileCopyJob" and keep other fields default value as it is. Followed by check to Enabled it.
After General settings are done on the page now next is to put up Steps: -
- 1) Click on the New option available on the bottom of page to create New Job Step as shown in the below figure.
- 2) Give a suitable name to the step.
- 3) Select type to SQL Server Integration Services Package.
- 4) Run it as SQL Server Agent Services Account.
- 5) Select Server Name.
- 6) Browse for the package and select it on which we have to run task of copy.
7) And click on OK and you will find New Job created
After Job with its name and type of job is selected including specifying name of the server now next on the page is to schedule it. For that we will follow below steps: -
- 1) Click on the New option available on the bottom of page to create New Job Schedule as shown in the below figure.
- 2) Give it a suitable name and tick on the Enabled.
- 3) Set Frequency, Select Occurs to “Weekly” from dropdown and tick it to every “Monday”.
- 4) Set Daily Frequency, here we have set Starting time 09:59:00 PM and Set Start Date Duration 27-04-2016.
- 5) And click OK.
Now under Jobs of SQL Server Agent you will see “FileCopyJob” created as shown in the image below.
And on the set time when task runs you will see details of the executed task shown in the below figure with date, time, Job Name and will show Server name on which it has been executed.
Now to see task executed output go to folder which was set during configuring of connection manager.
So below image depicts the set location and file created by executed task.
Troubleshooting tip: in case if you do find the created file do a refresh on explorer you should be able to see created file.
Above are the steps included to run background file copy task on package under SQL Server Job.
Which component helps to copy files from one folder to other folder?
In SSIS we have component called File System Task which will help to create file, copy file and delete files
Once File System Task component is dragged on Data Flow then right click and do Edit on the component which will open the screen as shown in the following image. Once the screen is opened then set it with appropriate name and locationfor Source and Destination.
How is SSIS project structured?
SSIS project is structured in following manner as shown in the following image where we have Solution at the top and under that we have Project or we can have no. of Projects under it and Project have SSIS package under it with dtsx files.
What is the difference between package deployment and project deployment?
Package deployment | Project deployment |
This is the old way of doing SSIS deployment which was continued till SQL 2008 but now its very rare. | This is the new way of doing SSIS deployment which started from SQL 2012. |
In Package deployment individual package are deployed which is dtsx file. | In Project deployment complete project is deployed in one go. |
Package deployment has MSDB and File System which is deployed in Integration Services. | Project deployment SSISDB where entire project is deployed in SQL Server. |
In Package deployment only dtsx files can be deployed. | In Project deployment you can deploy complete project and dtsx files. |
Setting parameters at project level or package level i.e. dtsx file then use Project.params. | In multiple projects scenario if you want to set global parameters then use Environment. |
How to set a global parameter to all packages running in the same SQL Server?
When you have many or multiple projects at that time parameters are set as global through Environment.
So in other words if you want to create global parameters for all your packages in projects or across projects the answer to it is by creating environment.
So in order to achieve this go to your SQL Server Management Studio and do right click on "Environment" and create an environment by clicking on "Create Environment".
Once you click on "Create Environment" it will open as screen for creating environment.
In this screen give a suitable name to the environment here we have given “MyPackage” and then click OK as shown in the below screen -
Once you have created new Environment as "MyPackage" under the Environment then do right click and open its Properties as you see in the image below: -
It will open Properties windows where you have to set Variables by clicking on Variables on the left and giving suitable Name to that Variable on the right side, set “Type” as “String” and set “Value” which means location where the file is to be created, so write complete path “D:\MSBI\Loc2\Destt.txt” below is the image which depicts setting of Variables under Environment Properties.
Now in order to make it as your global parameter, under Object Explorer go to Projects -> FileCopyProgram on this now do right click and click on "Configure"
Once Configure screen is open, browse to Set Parameter Value and under Value select “Use environmental variable and set it’s field from dropdown to “Destination” and then click on OK.
Your global parameter by setting its value to environmental variable
Now under Parameters you will see new Value to Destination as shown in the below image which means that “FileCopyProgram” is now set Value as Destination.
And Next to Parameters you will see “Connection Managers” to set value in the same manner which you did for setting value of Parameters.
Just browse and set new values for Connection Managers if you wish to change it or keep it as it is if you want to let it remain unchanged.
And then finally click on OK.
Now if you go to Properties of “MyPackage” go to Values of Variables and this is going to be deployed location.
At the above location which we have seen under Variable Environment Properties you will deployed file after the Package is executed.
Just remember to tick on Environment and then click on OK.
You will see report of successful execution and if you go to output folder where you will see deployed file named “global.txt” as shown in the image below.
So quickly revising if you want to set parameters at project level which means to your dtsx file then use project.params
And if you want to set parameters at global level or multi-project level then use environmental variables.
How to deploy individual package (dtsx file)?
There are scenarios where several or single dtsx file have made changed by developers and in order to deploy those files individually we have option called “Import Packages” under “Object Explorer” of SQL Server Management Studio.
Once the wizard screen is open the very first page of the wizard is “Locate Package” as you see in the following screen. It is a wizard screen where you will browse and locate the path of single package or dtsx file from your system. After you locate the file click Next to go to following page and then screen by screen fulfill the requirements as shown in the image below. Very first screen is of locating the package on local system followed by other simple wizard screen.
How to convert project deployment to package deployment model?
Earlier version before to SQL 2012 there were deployment done at package level only i.e. package deployment. From SQL 2012 onwards Microsoft has made project deployment as default hence SQL Server 2012 onwards only project deployment is used as it has become default option.
So in case if you have requirement or such circumstances where need has arisen to move or do the package deployment then there is provision available under SSDT VS 2013 for it.
In order to achieve package level deployment from project level go to project under Solution Explorer of SSDT 2013 do right & click on “Convert to Package Deployment level” as shown in the following image: -
What are the different database design techniques for OLAP?
OLAP which is de-normalized and database designing technique to develop dataware house.
Preferred way of MSBI is to use OLAP(Online Analytical Programming) through Star Schema and Snowflakes which serves purpose for faster retrieval of data for doing analysis. So two common techniques used are Star Schema and Snowflakes in MSBI.
So with the help of following diagram we will start understanding Star Schema design. In simple words defining Star Schema is where we have one or two fact or measures table connected by many dimensions through primary and foreign key relationship. Most important thing is that dimensions are not connected with other dimension in Star Schema.
Why is normalization design not good for OLAP database?
Normalization means OLTP(Online Transaction Programming) which serves benefits of restricting from duplicate data getting entered into the database fields and it also ensures removal of redundancy within data. It is quiet essential when data comes into SQL system especially when doing insert, update and delete in such scenarios to avoid duplication and redundancy of external data. Whereas in MSBI our main need is to derive intelligent information, do analysis on it to retrieve data as fast possible and do forecasting so with normalization(OLTP) design technique our goal of serving MSBI need won’t be fulfilled.
In such circumstances de-normalization(OLAP) suits perfectly for MSBI which serves our purpose reading the data and fast retrieval of data, fetching fast records and doing analysis as soon as possible so widely used and preferred methods are Star Schema and Snow flakes.
What are dimensions and measures?
Dimensions and Measures are concept which are equivalently important to understand before proceeding learning of Star Schema and Snow flakes methods. Foremost thing we would like to bring in to picture is that analysis is done on numeric values i.e. numbers and using this number value we do forecasting. For different domain numbers are different if you count census then population are numbers, for account profit and loss are numbers, for company sales figures i.e. numbers are important aspect on which forecasting will be done.
With this numbers it is very true only numbers do not make sense alone so in order to sound useful there has to be some reference to it or some context like sales figure(number) as per country wise, year wise or product wise.
So now defining Measures in simple words, numbers on which we tend to do analysis to do forecasting it is also known as Facts.
And now stating Dimension which are nothing but reference or context which speaks all about Measures or Fact.
What is the difference between star schema and snow flake?
Star Schema | Snowflake Schema |
Star Schema model uses de-normalized data. | Snowflake model uses normalized data. |
In Star Schema you will see that dimension table are connected to Measures or Fact table directly. | In Snow flake you will also see that dimension table are connected to Measures or Fact table but also you will see dimension table having relationship other with dimension table. |
In Star Schema you will less complication due to less no. of joins as there are no relation between dimension table to dimension table. | In Snow flake there are more complication and no. joins as you will also find relation between dimension table to dimension table. |
Due to no relation between two dimension tables there are less no. of foreign keys, hence it faster compare to Snowflake.. | More relation between each dimension table using foreign key which in turns down to slow compare to Star Schema. |
Here Parallelism can be achieved as there is no direct relation between two dimensions. | Due to dependency between dimensions here you will not find parallelism. |
How to share connection manager across different tasks?
In a scenario where there are many packages in a project and all them getting connected to same database so there will be same connection manager for each package under such condition where we have same connection manager and creating connections again and again how about creating a common global connection manager and share it across different packages to perform task. Following is the image which depicts the same.
Now we will see practical steps how to do the same on visual studio. Go to Solution Explorer if already open or click on View in the Menu program and click on Solution Explorer.
Under Solution Explorer do right click on Connection Manager and do click on “New Connection Manager”
New page of Add SSIS Connection Manager you will see as shown in the below image and then select connection manager type as "ADO.NET" and then click on Add.
After clicking on Add we will move to next page of “Configure ADO.NET Connection Manager” here in this page need to select Data connections if present in the Data connections if present or else need to click “New” to select new server and add it as Data connections. Once server/database name is selected as Data conenctions and click on OK.
Now you will see database name added below global connection manager which is now been set for all packages.
So now if you go and edit ADO.NET Destination in Editor to configure its connection manager 7666266 - 0834,
How to disable and enable task on a package?
While working on SSIS package if you want Disable and enable task or any of its container that can be done easily achieved on control flow pane. Below is the image snapshot which shows how to enable or disable task on a package. Just do right click on container which will show us various right click option on it just click on “Disable” which will in turn instantly disable the current active container. This feature of MSBI is very useful whenwe debug package for any specific task irrespective of deleting other tasks. Once done with debugging, enablingof Tasks can be done easily likewise similarly as we did for disabling it.
How to execute a package?
In order to execute the SSIS package, right click the package named Main.dtsx within Solution Explorer and select “Execute Package” option from the right click available list as shown in the image below.
If we have more than one package how to run it as start up?
In a big MSBI project there are many packages under it and if you want make any one package as starting package then it can be done as setting it as startup package likewise as shown in the following image.
1) Go to Solution Explorer.
2) Within Solution Explorer, under Solution you will find the hierarchy and under to it you will see Project followed by SSIS Packages where you will see many package select the package which you want to make it as startup page here we have selected "Main.dtsx" as we want it be startup. After selecting it do right click on it
3) Once you do right click on it you will see "Set as StartUp Object".
Once you set it as startup from next time when you debug or execute project this package will run first.How to refresh fields in input source “Flat File Source” when input CSV or txt file changes?
For existing connected CSV or txt file currently being used in the project if it is changed with more input fields added from customer side as shown in the below images then how accommodate that changes in input source component. The answer to this is to reset columns in connection manager as it is responsible read from actual source CSV or txt file
For such important changes with added new fields to CSV or text files follow as mentioned below to update it in input source component "CustomerFact txt".
1) Edit connection manager by double clicking on it to open.
2) In the Editor screen go and click on "Columns".
3) Click on Reset column to update all new columns exactly as new CSV or text file has.
4) Click OK to close the connection manager editor screen with updated new component in accordance to CSV or txt file.
What is the datatype for unicode or non-unicode in SSIS?
The datatype for Unicode in SSIS is (Unicode string[DT_WSTR]) can be seen under Data Conversion component in SSIS. Just do double click on Data Conversion component which will open its Transformation Editor.
Similarly for non-unicodedatatype in SSIS is denoted by string[DT_STR].
What is the use of Lookup component?
Lookup component actually lookup by joining data in input column with columns in other table column which it is referring to. Lookup let us to get value related to current requirement without creating separate table to get that value. From CSV or text file it will lookup for text into database and in return it will get ID to that corresponding text, this is the prime purpose which lookup serves.
For example requirement here is to get dimension value and display it on Fact table, so here ID get displayed on FactTable when lookup is called. With the help of primary and foreign key relationship on database this can be achieved. Below image demonstrate the same Country table, States table, Product table and SalesPerson table corresponding ID are getting displayed on Fact table
How to continue execution of program if records are not found?
By default program gets halted and stops execution of the package if no records are found. Below image demonstrates the same where lookup does not move ahead and if records are not found and continues ahead if record persist.
While working with lookup component this issue arises if it does not find record it stops ahead with red cross mark on the component and error line in progress pane
And to get out of this hindrance some changes need to be implemented so that the program should move ahead without stopping the execution
Double click the component "LookupCountry" to open it in Edit mode and then on left side click on “General” and then click and select “Ignore Failure” under How to handle rows under No Match entries.
This change will make sure that program execution will continue even if the rows entries are not matched.
If we have missed out any question to answer feel free to contact us. We also provide one-to-one personalised msbi training and interview preparation. If you are looking for feel free to contact us at questpond@questpond.com or call us at +91-9967590707.
Gurunatha Dogi
Gurunatha Dogi is a software engineer by profession and founder of Onlinebuff.com, Onlinebuff is a tech blog which covers topics on .NET Fundamentals, Csharp, Asp.Net, PHP, MYSQL, SQL Server and lots more..... read more