Rm Module4.docx

  • Uploaded by: Afreen
  • June 2020

MODULE IV: ADVANCED STATISTICAL TECHNIQUES

INTRODUCTION: In nature we find a number of variables that are interrelated. Examples include the amount of rainfall and the production of paddy, the heat and volume of a gas, and the price of and demand for a commodity in the market. Correlation theory aims at finding the degree of relationship that exists between such variables. The statistical tool with which we measure the degree of relationship between two or more variables is technically called correlation.

Q1. CHARACTERISTICS OF CORRELATION
Below are some characteristics of correlation.
1. The correlation of a sample is represented by the letter r.
2. The possible values of a correlation range from -1 to +1.
3. A positive correlation indicates a positive linear association. The strength of the positive linear association increases as the correlation gets closer to +1.
4. A negative correlation indicates a negative linear association. The strength of the negative linear association increases as the correlation gets closer to -1.
5. A correlation of either +1 or -1 indicates a perfect linear relationship. This is hard to find with real data.
6. A correlation of 0 indicates that there is no linear relationship between the two variables, and/or that the best straight line through the data is horizontal.

Q2. TYPES OF CORRELATION
1. POSITIVE CORRELATION: If the two correlated variables move in the same direction, the correlation is called positive, i.e. if one variable increases, the other also increases, and if one variable decreases, the other also decreases. For example, demand and supply are positively correlated: if demand increases, supply increases, and if demand decreases, supply decreases. Both variables move in the same direction.
2. NEGATIVE CORRELATION: If the two correlated variables move in opposite directions, the correlation is called negative, i.e. if one variable decreases, the other increases. For example, price and demand are negatively correlated: if price increases, demand decreases, and if price decreases, demand increases.
3. ZERO CORRELATION: If there is no relationship between the two variables, the correlation is called zero correlation. For example, the marks scored by a student in tests and the amount of rainfall.
4. LINEAR CORRELATION: If the amount of change in one variable bears a constant ratio to the amount of change in the other variable throughout, the correlation is said to be linear. This type of correlation is found mostly among scientific variables, such as the heat and volume of a gas.
5. NON-LINEAR OR CURVILINEAR CORRELATION: If the amount of change in one variable does not bear a constant ratio to the amount of change in the other variable throughout, the correlation is said to be non-linear or curvilinear. Most variables other than scientific variables show non-linear correlation.

Q3. METHODS OF STUDYING CORRELATION
The following are the important methods of studying correlation:
1. Scatter diagram method.
2. Karl Pearson's method.
3. Rank correlation method.
4. Method of least squares.

1. Scatter diagram method: This is a non-mathematical method of studying the correlation between two variables. It gives a rough idea of the degree of correlation as well as its direction. If the paired observations in the data are plotted as co-ordinates on a graph, we get a scatter of points on the plane. By studying the scatter of points, we can roughly estimate the extent of correlation.
Merits:
1. As it is a non-mathematical method, it can be understood very easily.
2. Just by looking at the scatter of points we can get a rough idea about the existence of correlation.
Demerits:
1. As it is a non-mathematical method, we cannot measure the exact degree of correlation.
2. Interpretation of the diagram depends on the subjective judgment of the person.

2. Karl Pearson's coefficient of correlation: Karl Pearson gave a formula to determine the extent of correlation between two related variables. The coefficient of correlation is computed by dividing the sum of the products of the deviations of each pair of observations from their respective means by the product of the standard deviations of the two variables and the number of items. Symbolically:

$$ r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{N \, \sigma_X \, \sigma_Y} $$

where \bar{X} and \bar{Y} are the means of the two variables, \sigma_X and \sigma_Y are their standard deviations, and N is the number of pairs of observations.
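The verbal definition of Karl Pearson's coefficient given above can be sketched in a few lines of Python. This is only an illustration: the function name `pearson_r` and the price and demand figures are assumptions made up for the example, not data from these notes. The standard deviations divide by N so that the code matches the description (deviation products divided by N times the two standard deviations).

```python
import math


def pearson_r(xs, ys):
    """Sum of products of deviations divided by (N * sd_x * sd_y)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length series with at least two items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of the products of deviations from the respective means.
    dev_products = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Standard deviations computed with N in the denominator.
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return dev_products / (n * sd_x * sd_y)


# Hypothetical figures: price rises while demand falls.
price = [10, 12, 14, 16, 18]
demand = [50, 46, 41, 37, 30]
r = pearson_r(price, demand)
print(round(r, 3))  # a value close to -1, i.e. a strong negative correlation
```

A value near -1 here is consistent with the price and demand example of negative correlation used earlier in the module.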

Q4. REGRESSION ANALYSIS
In correlation, we studied how to find the extent of the relationship between two variables X and Y. The theory of correlation gives only the degree of relationship between two variables, not the nature of the relationship. That is, it does not tell which variable is to be treated as the cause and which as the effect. The study of regression addresses this by treating one variable as dependent on the other. Regression is a statistical method with the help of which we can estimate the value of one variable for a given value of the other variable. For example, if we know that the two variables demand and supply are correlated, then with the help of regression theory we can estimate the most probable value of demand for a given value of supply, or the most probable value of supply for a given value of demand. The regression phenomenon was first noted by Sir Francis Galton.

Q5. DISTINGUISH BETWEEN CORRELATION AND REGRESSION

Q6. APPLICATIONS OF REGRESSION
i) Predictive analysis: Predictive analysis, i.e. forecasting future opportunities and risks, is the most prominent application of regression analysis in business. Demand analysis, for instance, predicts the number of items which a consumer will probably purchase. For example, we can forecast the number of shoppers who will pass in front of a particular billboard and then use that data to estimate the maximum to bid for an advertisement.
ii) Operation efficiency: Regression models can also be used to optimize business processes. A factory manager, for example, can create a statistical model to understand the impact of oven temperature on the shelf life of the cookies baked in those ovens. In a call center, we can analyze the relationship between the wait times of callers and the number of complaints.
iii) Supporting decisions: Businesses today are overloaded with data on finances, operations and customer purchases. Increasingly, executives are leaning on data analytics to make informed business decisions, rather than relying on intuition and gut feel. Regression analysis can bring a scientific angle to the management of any business.
iv) Correcting errors: Regression is useful not only for lending empirical support to management decisions but also for identifying errors in judgment. For example, a retail store manager may believe that extending shopping hours will greatly increase sales. Regression analysis, however, may indicate that the increase in revenue would not be sufficient to cover the rise in operating expenses due to longer working hours (such as additional employee labour charges). Hence, regression analysis can provide quantitative support for decisions and prevent mistakes due to a manager's intuition.
v) New insights: Over time, businesses have gathered a large volume of unorganized data that has the potential to yield valuable insights. However, this data is useless without proper analysis. Regression analysis techniques can find relationships between different variables by uncovering patterns that were previously unnoticed. For example, analysis of data from point-of-sale systems and purchase accounts may highlight market patterns such as an increase in demand on certain days of the week or at certain times of the year.
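As a rough illustration of the oven-temperature example under operation efficiency, the sketch below fits a straight line by the method of least squares and uses it to estimate shelf life at a new temperature. The function names `fit_line` and `predict` and all the figures are hypothetical, chosen only to show the arithmetic; a real study would need actual measurements and a check that a straight line is a reasonable model.

```python
def fit_line(xs, ys):
    """Least-squares line y = intercept + slope * x for paired observations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length series with at least two items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x must vary to fit a line")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Most probable value of y for a given x under the fitted line."""
    return intercept + slope * x


# Hypothetical data: oven temperature (degrees C) and cookie shelf life (days).
oven_temp = [160, 170, 180, 190, 200]
shelf_life = [30, 28, 25, 23, 20]

slope, intercept = fit_line(oven_temp, shelf_life)
estimate = predict(slope, intercept, 185)
print(slope, intercept, estimate)
```

With these made-up figures the slope is negative, which would suggest that hotter ovens go with shorter shelf life; the fitted line then gives the estimated shelf life for any temperature within the observed range.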

Q7. FACTOR ANALYSIS
Introduction: Factor analysis is a method for modelling observed variables, and their covariance structure, in terms of a smaller number of underlying unobservable (latent) "factors". The factors are typically viewed as broad concepts or ideas that may describe an observed phenomenon. For example, a basic desire to attain a certain social level might explain most consumption behaviour. Factor analysis is a way of taking a mass of data and shrinking it to a smaller data set that is more manageable and more understandable. There are two types: exploratory and confirmatory.

i) Exploratory factor analysis is used when you do not have any idea about the structure of your data or how many dimensions there are in a set of variables.
ii) Confirmatory factor analysis is used for verification, when you have a specific idea about the structure of your data or how many dimensions there are in a set of variables.
The key concept of factor analysis is that multiple observed variables have similar patterns of responses because they are all associated with a latent (i.e. not directly measured) variable. For example, people may respond similarly to questions about income, education and occupation, which are all associated with the latent variable socio-economic status.

Q8. BASIC TERMS RELATING TO FACTOR ANALYSIS
i) Factor: A factor is an underlying dimension that accounts for several observed variables.
ii) Communality (h2): Communality, symbolized as h2, shows how much of each variable is accounted for by the underlying factors taken together.
iii) Eigen value (or latent root): When we take the sum of the squared values of the factor loadings relating to a factor, that sum is referred to as the Eigen value or latent root. The Eigen value indicates the relative importance of each factor in accounting for the particular set of variables being analysed.
iv) Total sum of squares: When the Eigen values of all factors are totalled, the resulting value is termed the total sum of squares.
v) Rotation: Rotation, in the context of factor analysis, is something like staining a microscope slide. Just as different stains reveal different structures in the tissue, different rotations give results that appear to be entirely different, but from a statistical point of view all the results are taken as equal, none superior or inferior to the others. However, from the standpoint of making sense of the results of factor analysis, one must select the right rotation. If the factors are independent, an orthogonal rotation is done, and if the factors are correlated, an oblique rotation is made.
vi) Factor loadings: Factor loadings are those values which explain how closely the variables are related to each one of the factors discovered. They are also known as factor-variable correlations.
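The definitions of communality, Eigen value and total sum of squares given above are simple sums over a table of factor loadings, so they can be checked with a short sketch. The loading values and variable names below are hypothetical (they are not output from a real factor analysis), and the helper names `communalities` and `eigen_values` are made up for this example.

```python
# Hypothetical factor loadings: one row per variable, one column per factor.
loadings = {
    "income":     [0.80, 0.10],
    "education":  [0.70, 0.30],
    "occupation": [0.60, 0.20],
    "age":        [0.10, 0.90],
}


def communalities(table):
    """h2 per variable: sum of squared loadings across all factors (row-wise)."""
    return {var: sum(value ** 2 for value in row) for var, row in table.items()}


def eigen_values(table):
    """Per factor: sum of squared loadings across all variables (column-wise)."""
    n_factors = len(next(iter(table.values())))
    return [sum(row[j] ** 2 for row in table.values()) for j in range(n_factors)]


h2 = communalities(loadings)
eig = eigen_values(loadings)
total_ss = sum(eig)  # total sum of squares: the Eigen values added together

for var, value in h2.items():
    print(f"h2({var}) = {value:.2f}")
print("Eigen values:", [round(e, 2) for e in eig])
print("Total sum of squares:", round(total_ss, 2))
```

Because both quantities are sums of the same squared loadings, the communalities add up to the same total as the Eigen values, which is a handy consistency check on a loading table.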

Q9. APPLICATIONS OF FACTOR ANALYSIS
i) Interdependency and pattern delineation: If a scientist has a table of data (say, UN votes, personality characteristics, or answers to a questionnaire) and suspects that these data are interrelated in a complex fashion, then factor analysis may be used to untangle the linear relationships into their separate patterns.
ii) Parsimony or data reduction: Factor analysis can be useful for reducing a mass of information to an economical description. For example, data on fifty characteristics for 300 nations are unwieldy to handle, descriptively or analytically. Nations can be more easily discussed and compared on economic development, size, and politics dimensions, for example, than on the hundreds of characteristics each dimension involves.
iii) Structure: Factor analysis may be employed to discover the basic structure of a domain. As a case in point, a scientist may want to uncover the primary independent lines or dimensions (such as size, leadership, and age) of variation in group characteristics and behaviour.
iv) Classification or description: Factor analysis is a tool for developing an empirical typology. It can be used to group interdependent variables into descriptive categories, such as ideology, revolution, liberal voting and authoritarianism. It can be used to classify nation profiles into types with similar characteristics or behaviour.
v) Scaling: A scientist often wishes to develop a scale on which individuals, groups, or nations can be rated and compared. The scale may refer to such phenomena as political participation, voting behaviour, or conflict. A problem in developing a scale is how to weight the characteristics being combined. Factor analysis offers a solution by dividing the characteristics into independent sources of variation (factors). Each factor then represents a scale based on the empirical relationships among the characteristics.
vi) Hypothesis testing: Hypotheses abound regarding dimensions of attitude, personality, group, social behaviour, voting, and conflict. Since the meaning usually associated with "dimension" is that of a cluster or group of highly intercorrelated characteristics or behaviours, factor analysis may be used to test for their empirical existence.
vii) Data transformation: Factor analysis can be used to transform data to meet the assumptions of other techniques. A large number of dependent variables can also be reduced through factor analysis.
viii) Exploration: In a new domain of scientific interest like peace research, the complex interrelations of phenomena have undergone little systematic investigation. The unknown domain may be explored through factor analysis. It can reduce complex interrelationships to a relatively simple linear expression.
ix) Mapping: Besides facilitating exploration, factor analysis also enables a scientist to map the social terrain, that is, to chart its major empirical concepts and sources of variation. These concepts may then be used to describe a domain or to serve as inputs to further research. Some social domains, such as international relations, family life, and public administration, have yet to be charted. In some other areas, however, such as personality, abilities, attitudes, and cognitive meaning, considerable mapping has been done.

Q10. CLUSTER ANALYSIS
Introduction: Cluster analysis is a data exploration (mining) tool for dividing a multivariate dataset into "natural" clusters (groups). We use these methods to explore whether previously undefined clusters (groups) exist in the dataset. For instance, a marketing department may wish to use survey results to sort its customers into categories. Cluster analysis is a multivariate method which aims to classify a sample of subjects, on the basis of a set of measured variables, into a number of different groups such that similar subjects are placed in the same group. An example where this might be used is in the field of psychiatry, where the characterisation of patients on the basis of clusters of symptoms can be useful in identifying an appropriate form of therapy. In marketing, it may be useful to identify distinct groups of potential customers so that, for example, advertising can be appropriately targeted.

Cluster analysis is an exploratory analysis that tries to identify structure within the data. Cluster analysis is also called segmentation analysis or taxonomy analysis.
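To make the idea of sorting subjects into "natural" groups concrete, here is a minimal sketch using k-means, which is only one of many clustering algorithms and is not named in these notes; it is an assumption chosen for its simplicity. The function name `k_means`, the customer scores and the starting centroids are all hypothetical. The starting centroids are passed in explicitly so that the result is reproducible.

```python
import math


def k_means(points, initial_centroids, max_iter=100):
    """Plain k-means: assign each point to its nearest centroid, then move
    each centroid to the mean of its members, until nothing changes."""
    centroids = [tuple(c) for c in initial_centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for every point.
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: mean of the members of each cluster.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                new_centroids.append(old)  # keep an empty cluster where it was
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


# Hypothetical survey scores per customer: (price sensitivity, brand loyalty).
customers = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
labels, centroids = k_means(customers, [customers[0], customers[3]])
print(labels)
print(centroids)
```

The first three customers end up in one group and the last three in another. In practice the number of clusters and the starting centroids have to be chosen by the analyst, and different choices can give different groupings.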

Q11. APPLICATIONS OF CLUSTER ANALYSIS
On PET scans, cluster analysis can be used to differentiate between different types of tissue in a three-dimensional image for many different purposes. Other applications include:
1. Market research: Cluster analysis is widely used in market research when working with multivariate data from surveys and test panels. Market researchers use cluster analysis to partition the general population of consumers into market segments and to better understand the relationships between different groups of consumers or potential customers.
2. Social network analysis: In the study of social networks, clustering may be used to recognize communities within large groups of people.
3. Search result grouping: In the process of intelligent grouping of files and websites, clustering may be used to create a more relevant set of search results compared to normal search engines like Google.
4. Software evolution: Clustering applied to code is a form of restructuring and hence a way of direct preventive maintenance.
5. Recommender systems: Recommender systems are designed to recommend new items based on a user's tastes. They sometimes use clustering algorithms to predict a user's preferences based on the preferences of other users in the user's cluster.
6. Crime analysis: Cluster analysis can be used to identify areas where there are greater incidences of particular types of crime. By identifying these distinct areas or "hot spots" where a similar crime has happened over a period of time, law enforcement resources can be managed more effectively.
7. Educational data mining: Cluster analysis is used, for example, to identify groups of schools or students with similar properties.
8. Climatology: To find weather regimes or preferred sea level pressure atmospheric patterns.
9. Petroleum geology: Cluster analysis is used to reconstruct missing bottom hole core data.
10. Physical geography: The clustering of chemical properties in different sample locations.

Q12. DISCRIMINANT ANALYSIS
Introduction: Discriminant analysis is a regression-based statistical technique used to determine which particular classification or group (such as 'ill' or 'healthy') an item of data or an object (such as a patient) belongs to, on the basis of its characteristics or essential features. It differs from group-building techniques such as cluster analysis in that the classifications or groups to choose from must be known in advance. Discriminant analysis is a form of multivariate analysis in which the objective is to establish a discriminant function. The function (typically a mathematical formula) discriminates between individuals in the population and allocates each of them to a group within the population. The function is established on the basis of a series of measurements or observations made on the individuals.
Two objectives: Discriminant analysis may be used for two objectives:
(1) When we want to assess the adequacy of a classification, given the group memberships of the objects under study.
(2) When we wish to assign objects to one of a number of (known) groups of objects.
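The 'ill' or 'healthy' example can be sketched with Fisher's linear discriminant for two groups and two measurements. This is a minimal illustration under stated assumptions: the patient readings are invented, the helper names (`mean_vector`, `scatter_matrix`, `fit_discriminant`, `classify`) are hypothetical, and the cutoff is placed midway between the two group means, which implicitly assumes equally likely groups with similar spread.

```python
def mean_vector(rows):
    """Column-wise mean of a list of (x1, x2) observations."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def scatter_matrix(rows, mean):
    """2x2 within-group sums of squares and cross-products."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for row in rows:
        d = [row[0] - mean[0], row[1] - mean[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fit_discriminant(group_a, group_b):
    """Weights w = Sw^-1 (mean_b - mean_a) and a midpoint cutoff score."""
    m_a, m_b = mean_vector(group_a), mean_vector(group_b)
    s_a, s_b = scatter_matrix(group_a, m_a), scatter_matrix(group_b, m_b)
    sw = [[s_a[i][j] + s_b[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    if det == 0:
        raise ValueError("pooled scatter matrix is singular")
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [m_b[0] - m_a[0], m_b[1] - m_a[1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    midpoint = [(m_a[k] + m_b[k]) / 2 for k in range(2)]
    cutoff = w[0] * midpoint[0] + w[1] * midpoint[1]
    return w, cutoff


def classify(w, cutoff, x):
    """Allocate an observation to a group using the discriminant score."""
    score = w[0] * x[0] + w[1] * x[1]
    return "ill" if score > cutoff else "healthy"


# Hypothetical readings: (body temperature in degrees C, pulse in beats/min).
healthy = [(36.6, 70), (36.8, 72), (37.0, 68), (36.7, 75)]
ill = [(38.5, 95), (39.0, 100), (38.2, 90), (38.8, 98)]

w, cutoff = fit_discriminant(healthy, ill)
print(classify(w, cutoff, (38.9, 97)))  # new patient with high readings
print(classify(w, cutoff, (36.9, 71)))  # new patient with normal readings
```

The two groups must be known in advance, as the text notes; the function only learns how to tell them apart and then allocates new patients to one of them.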

Q13. CHARACTERISTICS OF DISCRIMINANT ANALYSIS
1) Essentially a single technique consisting of a couple of closely related procedures.
2) Operates on data sets for which pre-specified, well-defined groups already exist.
3) Assesses the dependent relationship between one set of discriminating variables and a single grouping variable; an attempt is made to define the relationship between independent and dependent variables.
4) Extracts dominant, underlying gradients of variation (canonical functions) among groups of sample entities (e.g. species, sites, observations, etc.) from a set of multivariate observations, such that variation among groups is maximized and variation within groups is minimized along the gradient.
5) Reduces the dimensionality of a multivariate data set by condensing a large number of original variables into a smaller set of new composite dimensions (canonical functions) with a minimum loss of information.
6) Summarizes data redundancy by placing similar entities in proximity in canonical space and producing a parsimonious understanding of the data in terms of a few dominant gradients of variation.
7) Describes maximum differences among pre-specified groups of sampling entities based on a suite of discriminating characteristics (i.e. canonical analysis of discrimination).
8) Predicts the group membership of future samples, or samples from unknown groups, based on a suite of classification characteristics (i.e. classification).
9) Is an extension of multiple regression analysis if the research situation defines the group categories as dependent upon the discriminating variables, and a single random sample (N) is drawn in which group membership is "unknown" prior to sampling.
10) Is an extension of multivariate analysis of variance if values on the discriminating variables are defined as dependent upon the groups, and separate independent random samples (N1, N2, ...) of two or more distinct populations (i.e. groups) are drawn in which group membership is "known" prior to sampling.

Q14. APPLICATIONS OF DISCRIMINANT ANALYSIS
Applications of discriminant analysis are as follows:
1) Bankruptcy prediction: In bankruptcy prediction based on accounting ratios and other financial variables, linear discriminant analysis was the first statistical method applied to systematically explain which firms entered bankruptcy and which survived.
2) Face recognition: In computerized face recognition, each face is represented by a large number of pixel values. Linear discriminant analysis is primarily used here to reduce the number of features to a more manageable number before classification. Each of the new dimensions is a linear combination of pixel values, which forms a template. The linear combinations obtained using Fisher's linear discriminant are called Fisherfaces, while those obtained using the related principal component analysis are called eigenfaces.
3) Marketing: In marketing, discriminant analysis was once often used to determine the factors which distinguish different types of customers and/or products on the basis of surveys or other forms of collected data.
4) Biomedical studies: The main application of discriminant analysis in medicine is the assessment of the severity of a patient's state and the prognosis of disease outcome. For example, during retrospective analysis, patients are divided into groups according to the severity of the disease: mild, moderate and severe.

Q15. MULTIDIMENSIONAL SCALING
Multidimensional scaling (MDS) is a visual representation of distances or dissimilarities between sets of objects. "Objects" can be colors, faces, map coordinates, political persuasions, or any kind of real or conceptual stimuli. Objects that are more similar (or have shorter distances) are closer together on the graph than objects that are less similar (or have longer distances). The term scaling comes from psychometrics, where abstract concepts ("objects") are assigned numbers according to a rule. For example, you may want to quantify a person's attitude to global warming. You could assign a "1" to "doesn't believe in global warming", a "10" to "firmly believes in global warming" and a scale of 2 to 9 for attitudes in between. MDS is thus a means of visualizing the level of similarity of individual cases of a dataset.

STEPS IN CONDUCTING MDS: The following are the steps in conducting MDS research:
1) Formulating the problem: What variables do you want to compare? How many variables do you want to compare? What purpose is the study to be used for?
2) Obtaining input data: Respondents are asked a series of questions. For each product pair they are asked to rate similarity (usually on a 7-point Likert scale from very similar to very dissimilar). The first question could be for Coke/Pepsi, for example, the next for Coke/Hires Root Beer, the next for Pepsi/Dr Pepper, the next for Dr Pepper/Hires Root Beer, etc.
3) Running the MDS statistical program: Software for running the procedure is available in many statistical software packages.
4) Deciding the number of dimensions: The researcher must decide on the number of dimensions they want the computer to create. The more dimensions, the better the statistical fit, but the more difficult it is to interpret the results.
5) Mapping the results and defining the dimensions: The statistical program (or a related module) will map the results. The map will plot each product (usually in two-dimensional space). The proximity of products to each other indicates either how similar they are or how preferred they are, depending on which approach was used.
6) Testing the results for reliability and validity: Compute R-squared to determine what proportion of the variance of the scaled data can be accounted for by the MDS procedure.
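Step 6 can be illustrated with a small sketch. It takes one common reading of the R-squared check: the squared correlation between the respondents' dissimilarity ratings and the distances between the same product pairs on the finished map. The map coordinates and the ratings below are invented for the example (a real MDS program would produce the coordinates), and the function name `r_squared` is hypothetical; statistical packages may compute their fit index somewhat differently.

```python
import math
from itertools import combinations

brands = ["Coke", "Pepsi", "Dr Pepper", "Hires Root Beer"]

# Hypothetical two-dimensional map positions, as an MDS program might output.
coords = {
    "Coke": (0.0, 0.0),
    "Pepsi": (1.0, 0.0),
    "Dr Pepper": (4.0, 3.0),
    "Hires Root Beer": (4.0, 4.0),
}

# Hypothetical averaged ratings: 1 = very similar, 7 = very dissimilar.
ratings = {
    ("Coke", "Pepsi"): 1,
    ("Coke", "Dr Pepper"): 6,
    ("Coke", "Hires Root Beer"): 7,
    ("Pepsi", "Dr Pepper"): 5,
    ("Pepsi", "Hires Root Beer"): 6,
    ("Dr Pepper", "Hires Root Beer"): 2,
}


def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("R-squared is undefined when a series is constant")
    return (sxy * sxy) / (sxx * syy)


pairs = list(combinations(brands, 2))
map_distances = [math.dist(coords[a], coords[b]) for a, b in pairs]
input_ratings = [ratings[pair] for pair in pairs]

fit = r_squared(input_ratings, map_distances)
print(round(fit, 3))  # close to 1 means the map reproduces the ratings well
```

A value close to 1 would suggest that the two-dimensional map accounts for most of the variation in the ratings; a low value would be a reason to try more dimensions, as discussed in step 4.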

Q16. APPLICATIONS OF MULTIDIMENSIONAL SCALING
1) Scale construction: MDS gives a composite picture of how the respondent views an object, brand, city, etc. when compared to others in the category. This can be done using similarity or preference data. The researcher then tries to name the dimensions that could have been the basis of the comparison. For example, in an illustration about cities, the researcher may feel that the two dimensions used by the respondent were (i) city culture and (ii) job opportunities.
2) Brand image analysis: Many marketers use the technique to measure possible gaps between the company's or brand's positioning and the consumer's perception of the brand image.
3) Development of new products: MDS is one of the most powerful tools to use at the idea generation or concept testing stage. It helps us to identify quadrants that are less crowded and where a clear product launch opportunity exists. If the product team has come up with more than one probable concept, the preferences of consumers regarding these can be placed on a spatial map to see which concept finds higher acceptability on multiple dimensions.
4) Pricing studies: The marketer can use subjective maps to assess whether price is making a difference to the preference for or demand of the brand, by comparing spatial maps of the competing brands with and without the criterion of price to see whether the positioning of the brand is affected by price or not.
5) Assessing communication effectiveness: The brand manager can design a before-and-after study to assess the placement of the brand before and after a specific repositioning or a new advertising campaign, to see its impact on brand perception.
