Data Science Articles List
Great Python Books
  • Recommended
  • Python Cookbook

    By David Beazley & Brian K. Jones

  • Python for Data Analysis

    By Wes McKinney

  • Python Data Science Handbook

    By Jake VanderPlas

  • Python Machine Learning

    By Sebastian Raschka

  • Hands-On Machine Learning with Scikit-Learn & TensorFlow

    By Aurélien Géron

  • Mastering Machine Learning with scikit-learn

    By Gavin Hackeling

  • Monetizing Machine Learning

    By Manuel Amunategui

  • Building Machine Learning Systems with Python

    By W. Richert

  • Learn Python the Hard Way

    By Zed A. Shaw

  • Python Crash Course: A Hands-On, Project-Based Introduction to Programming

    By Eric Matthes

  • Fluent Python

    By Luciano Ramalho

Dataset Hub A

  • List of Datasets
  • Awesome Public Datasets on GitHub, curated by caesar0301.
  • AWS (Amazon Web Services) Public Data Sets, provides a centralized repository of public data sets that can be seamlessly integrated into AWS cloud-based applications.
  • Anacode Chinese Web Datastore: a collection of crawled Chinese news and blogs in JSON format.
  • AssetMacro, historical data of Macroeconomic Indicators and Market Data.
  • BigML big list of public data sources.
  • Bioassay data, described in Virtual screening of bioassay data, by Amanda Schierz, J. of Cheminformatics, with 21 Bioassay datasets (Active / Inactive compounds) available for download.
  • Bitly data, anonymized clicks on gov links.
  • Canada Open Data, pilot project with many government and geospatial datasets.
  • Causality Workbench data repository.
  • Corral Big Data repository at Texas Advanced Computing Center, supporting data-centric science.
  • Credit Risk Analytics Data: a home equity loans credit data set, mortgage loan level data set, Loss Given Default (LGD) data set and corporate ratings data set.
  • CrowdFlower Data for Everyone library.
  • Data Source Handbook, A Guide to Public Data, by Pete Warden, O'Reilly (Jan 2011).
  • Open government data from the US, EU, Canada, CKAN, and more.
  • Publicly available data from the UK (see also the London Datastore).
  • Central guide for education data resources, including high-value data sets, data visualization tools, resources for the classroom, applications created from open data, and more.
  • DataMarket, visualize the world's economy, societies, nature, and industries, with 100 million time series from UN, World Bank, Eurostat and other important data providers.
  • Datamob, public data put to good use.
  • Data Planet, the largest repository of standardized and structured statistical data, with over 25 billion data points, 4.3 billion datasets, and 400+ source databases.
  • Datasets for data geeks: find and share Machine Learning datasets.
  • A clearinghouse of datasets available from the City & County of San Francisco, CA.
  • DataFerrett, a data mining tool that accesses and manipulates TheDataWeb, a collection of many on-line US Government datasets.
  • Delve, Data for Evaluating Learning in Valid Experiments.
  • EconData, thousands of economic time series, produced by a number of US Government agencies.
  • Discover and share cool data, connect with interesting people, and work together to solve problems faster.
  • Enron Email Dataset, data from about 150 users, mostly senior management of Enron.
  • Europeana Data, contains open metadata on 20 million texts, images, videos and sounds gathered by Europeana - the trusted and comprehensive resource for European cultural heritage content.
  • FEDSTATS, a comprehensive source of US statistics and more.
  • FIMI repository for frequent itemset mining, implementations and datasets.
  • Financial Data Finder at OSU, a large catalog of financial data sets.
  • GDELT: The Global Data on Events, Location and Tone, described by the Guardian as "a big data history of life, the universe and everything."
  • GEO (Gene Expression Omnibus), a gene expression/molecular abundance repository supporting MIAME-compliant data submissions, and a curated online resource for gene expression data browsing, query, and retrieval.
  • GeoDa Center, geographical and spatial data.
  • Google ngrams datasets, text from millions of books scanned by Google.
  • Grain Market Research, financial data including stocks, futures, etc.
  • HitCompanies Datasets, comprehensive data on 10,000 UK companies randomly sampled from HitCompanies, updated automatically using AI/Machine Learning.
  • ICWSM-2009 dataset, containing 44 million blog posts made between August 1st and October 1st, 2008.
  • Infochimps, an open catalog and marketplace for data. You can share, sell, curate, and download data about anything and everything.
  • Investor Links, which includes financial data.
Dataset Hub B


Useful Resources for Data Science
