Data Mining
Essay by JungleKatC • August 11, 2013 • Essay • 1,186 Words (5 Pages) • 1,457 Views
The purpose of association analysis is to find patterns in particular in business processes and to formulate suitable rules. Association analysis is useful for discovering relationships hidden in large amounts of data and helps to identify cross-selling opportunities. There are two things to remember when using association analysis with regard to market data: discovering patterns from a large transaction data set can be computationally expensive and some of the discovered patterns are potentially spurious because they may happen simply by chance.
Association discovery finds rules about items that appear together in an event, such as a purchase transaction. Market-basket analysis is a well-known example of association discovery that is used for recommendation engines. Recommendation engines are used to suggest products to customers based their behaviors and their social circle. The algorithm in the engine processes the available information and provides real-time and personalized recommendations. These recommendations are tailored to respond dynamically to each user and differ in real time based on the user's online activities. (www.infosys.com)
Web data mining is a process that allows companies to recognize patterns of customer behavior and act on these findings, providing an opportunity to target marketing efforts and increase sales. The process extracts structured information from unstructured or semi-structured web data sources. Companies use web data mining as a tool to gather data from different websites and collate it together to do analysis and build websites that provide information from different websites. For business intelligence, competitive e-commerce market and the massive number of options customers have today have forced businesses to employ marketing strategies that are built largely on data mined from web mining. Business intelligence keeps a business informed of market trends, alerts about new avenues of generating revenue, and helps determine the status of the competition.
Clustering is a method of grouping customers together so that the customers within a group, or cluster, are more similar to each other than to customers outside the group. That is, the difference between members of different clusters is greater than the differences between members of the same cluster. (www.nesug.org, 2006)
Businesses today collect information about what pages site users visit and the sequence in which the pages are visited. Because the business provides online ordering, customers must log in to the site which provides the company with information for each customer profile. By using a clustering algorithm on this data, the business can find groups, or clusters, of customers who have similar patterns or sequences of clicks. The business can then use these clusters to analyze how users move through the Web site, to identify particular timeframe that customers are more likely to make purchases, to determine geographical clusters, or to determine the best or worst customers. Cluster analysis offers a wealth of opportunities for investigating a customer database for trends over time, geographic patterns, or identifying the profitable customers from the un-profitable. Distance can be thought of in many ways and as such the possibilities are almost limitless
Reliability of the data mining algorithms in increased with sufficient testing procedures. Measures of data mining generally fall into the categories of accuracy, reliability, and usefulness. Accuracy is a measure of how well the model correlates an outcome with the attributes in the data that has been provided. There are various measures of accuracy, but all measures of accuracy are dependent on the data that is used. In reality, values might be missing or approximate, or multiple processes might have changed the data. Particularly in the phase of exploration and development, you might decide to accept a certain amount of error in the data, especially if the data is uniform in its characteristics. For example, a model that predicts sales for a particular store based on past sales can be strongly correlated and very accurate, even if that store consistently used the wrong accounting method. Therefore, measurements of accuracy must be balanced by assessments of reliability.
Reliability assesses the way that a data mining model performs on different data sets. A data mining model is reliable if it generates the same type of predictions or finds the same general kinds of patterns regardless of the test data that is supplied. For example, the model that you generate for the store
...
...