Book Detail: Mastering Clojure Data Analysis

Book Title: 
Mastering Clojure Data Analysis
Resource Category: 
Publisher: 
Publication Year: 
2014
Number of Pages: 
340
ISBN: 
978-1-78328-413-9
Language: 
English
WishList: 
yes
Available at Shelf: 
No
Description: 

Leverage the power and flexibility of Clojure through this practical guide to data analysis

Table of Contents (Summary): 
  1. Network Analysis – The Six Degrees of Kevin Bacon

  2. GIS Analysis – Mapping Climate Change

  3. Topic Modeling – Changing Concerns in the State of the Union Addresses

  4. Classifying UFO Sightings

  5. Benford's Law – Detecting Natural Progressions of Numbers

  6. Sentiment Analysis – Categorizing Hotel Reviews

  7. Null Hypothesis Tests – Analyzing Crime Data

  8. A/B Testing – Statistical Experiments for the Web

  9. Analyzing Social Data Participation

  10. Modeling Stock Data

Index  

Table of Contents (Expanded): 
  1. Network Analysis – The Six Degrees of Kevin Bacon

    • Analyzing social networks 

    • Getting the data 

    • Understanding graphs 

    • Implementing the graphs  

      • Loading the data 

    • Measuring social network graphs  

      • Density 

      • Degrees  

      • Paths 

      • Average path length 

      • Network diameter  

      • Clustering coefficient 

      • Centrality 

      • Degrees of separation  

    • Visualizing the graph 

      • Setting up ClojureScript 

      • A force-directed layout  

      • A hive plot  

      • A pie chart  

    • Summary 

  2. GIS Analysis – Mapping Climate Change

    • Understanding GIS 

    • Mapping the climate change 

      • Downloading and extracting the data 

        • Downloading the files 

        • Extracting the files 

      • Transforming the data – filtering 

      • Rolling averages 

        • Reading the data 

      • Interpolating sample points and generating heat maps using inverse distance weighting (IDW) 

    • Working with map projections 

      • Finding a base map  

    • Working with ArcGIS 

    • Summary 

  3. Topic Modeling – Changing Concerns in the State of the Union Addresses

    • Understanding data in the State of the Union addresses 

    • Understanding topic modeling 

    • Preparing for visualizations 

    • Setting up the project 

    • Getting the data 

      • Loading the data into MALLET 

      • Visualizing with D3 and ClojureScript 

      • Exploring the topics  

        • Exploring topic 43  

        • Exploring topic 26  

        • Exploring topic 42  

    • Summary 

  4. Classifying UFO Sightings

    • Getting the data 

    • Extracting the data 

    • Dealing with messy data 

    • Visualizing UFO data  

    • Description 

    • Topic modeling descriptions  

    • Hoaxes 

      • Preparing the data 

        • Reading the data into a sequence of data records  

        • Splitting the NUFORC comments  

        • Categorizing the documents based on the comments 

        • Partitioning the documents into directories based on the categories 

        • Dividing them into training and test sets 

      • Classifying the data  

        • Coding the classifier interface 

        • Running the classifier and examining the results 

    • Summary 

  5. Benford's Law – Detecting Natural Progressions of Numbers

    • Learning about Benford's Law 

      • Applying Benford's Law to compound interest

      • Looking at the world population data 

    • Failing Benford's Law 

    • Case studies 

    • Summary 

  6. Sentiment Analysis – Categorizing Hotel Reviews

    • Understanding sentiment analysis 

    • Getting hotel review data  

    • Exploring the data  

    • Preparing the data  

      • Tokenizing  

      • Creating feature vectors 

      • Creating feature vector functions and POS tagging 

    • Cross-validating the results 

    • Calculating error rates 

    • Using the Weka machine learning library 

      • Connecting Weka and cross-validation

      • Understanding maximum entropy classifiers 

      • Understanding naive Bayesian classifiers 

    • Running the experiment 

    • Examining the results 

      • Combining the error rates 

    • Improving the results 

    • Summary 

  7. Null Hypothesis Tests – Analyzing Crime Data

    • Introducing confirmatory data analysis  

    • Understanding null hypothesis testing 

      • Understanding the process  

        • Formulating an initial hypothesis 

        • Stating the null and alternative hypotheses  

        • Determining appropriate tests 

        • Selecting the significance level 

        • Determining the critical region 

        • Calculating the test statistic and its probability

        • Deciding whether to reject the null hypothesis or not  

      • Flipping coins 

        • Formulating an initial hypothesis 

        • Stating the null and alternative hypotheses  

        • Identifying the statistical assumptions in the sample  

        • Determining appropriate tests 

    • Understanding burglary rates 

      • Getting the data 

      • Parsing the Excel files  

      • Pulling out raw data  

        • Growing a data tree  

        • Cutting down the data tree  

        • Putting it all together 

        • Transforming the data  

        • Joining the data sources  

        • Pivoting the data 

        • Filtering the missing data 

        • Putting it all together 

    • Exploring the data  

      • Generating summary statistics 

        • Summarizing UNODC crime data  

        • Summarizing World Bank land area and GNI data  

      • Generating more charts and graphs  

    • Conducting the experiment  

      • Formulating an initial hypothesis 

      • Stating the null and alternative hypotheses  

      • Identifying the statistical assumptions in the sample  

      • Determining appropriate tests 

        • Understanding Spearman's rank correlation coefficient 

      • Selecting the significance level 

      • Determining the critical region 

      • Calculating the test statistic and its probability 

      • Deciding whether to reject the null hypothesis or not  

    • Interpreting the results  

    • Summary 

  8. A/B Testing – Statistical Experiments for the Web

    • Defining A/B testing 

    • Conducting an A/B test 

      • Planning the experiment  

      • Framing the statistics 

      • Building the experiment 

        • Looking at options to build the site 

      • Implementing A/B testing on the server 

        • Understanding the scaffolded site  

      • Building the test site  

      • Implementing A/B testing 

      • Viewing the results 

        • Looking at A/B testing as a user 

      • Analyzing the results 

        • Understanding the t-test 

      • Testing the results 

    • Summary 

  9. Analyzing Social Data Participation

    • Setting up the project 

      • Understanding the analyses 

      • Understanding social network data 

      • Understanding knowledge-based social networks  

      • Introducing the 80/20 rule  

        • Getting the data 

        • Looking at the amount of data 

        • Defining and loading the data 

        • Counting frequencies 

        • Sorting and ranking  

        • Finding the patterns of participation  

      • Matching the 80/20 rule 

      • Looking for the 20 percent of questioners 

      • Looking for the 20 percent of respondents 

      • Combining ranks 

        • Looking at those who only post questions 

        • Looking at those who only post answers  

        • Looking at those who post both questions and answers 

      • Finding the up-voted answers 

      • Processing the answers 

        • Predicting the accepted answer  

      • Setting up 

        • Creating the InstanceList object 

      • Training sets and Test sets  

        • Training  

        • Testing 

      • Evaluating the outcome 

    • Summary 

  10. Modeling Stock Data

    • Learning about financial data analysis 

    • Setting up the basics 

      • Setting up the library 

      • Getting the data 

    • Getting prepared with data 

      • Working with news articles 

      • Working with stock data 

    • Analyzing the text 

      • Analyzing vocabulary 

      • Stop lists 

      • Hapax and Dis Legomena 

      • TF-IDF 

    • Inspecting the stock prices  

    • Merging text and stock features 

    • Analyzing both text and stock features together with neural nets 

      • Understanding neural nets  

      • Setting up the neural net  

      • Training the neural net  

      • Running the neural net 

      • Validating the neural net  

      • Finding the best parameters 

    • Predicting the future  

      • Loading stock prices 

      • Loading news articles  

      • Creating training and test sets 

      • Finding the best parameters for the neural network 

      • Training and validating the neural network 

      • Running the network on new data  

    • Taking it with a grain of salt 

      • Related to this project 

      • Related to machine learning and market modeling in general  

    • Summary

Index  

Average: 3.4 (209 votes)
