A Brief History of Predictive Analytics – Part 2

This is part 2 of a 3-part blog in which we outline the history of Predictive Analytics.  Given it is such a hot topic with our warranty manufacturing clients – as well as just about any company looking to drive higher efficiency, profitability and customer satisfaction – we thought a quick history lesson might be warranted (pun intended).  Part 1 takes you from the 1940’s – 1950’s, Part 2 from the 1960’s – 1990’s, and Part 3 from the 2000’s – today.

The 1960’s: IBM and Database Management Systems

  • Key Players:
    • Large Organizations (e.g. IBM, who had the need, the resources and the capital to invest in developing database systems)
  • Significant Events:
    • IBM began developing the floppy disk in 1967 and was the first to sell “disk storage”, which allowed data to be accessed directly and shared between computers.
    • Online processing began thanks to the ability to share data – claims processing, bank teller processing, airline reservation processing, retail point of sale processing.
  • Innovations:
    • Floppy disks – 8 inches across, these were basically magnetic disks sealed in a plastic jacket; by the mid-70’s floppy disks had become the most widely used form of portable storage; each could hold up to 8 formatted .doc files.
    • Database management systems (DBMS) – a collection of programs that enables you to store, modify, and extract information from a database.

The 1970’s – 1980’s: Relational Databases, Data Warehousing, & Decision Support Systems

  • Key Players:
    • Corporations, Startups, Oracle, Apple, Microsoft
  • Significant Events:
    • In 1970, E.F. Codd, an English computer scientist who worked for IBM, invented the relational model for database management, the theoretical basis for relational databases and relational database management systems.
    • The Black-Scholes model was developed in 1973 to estimate the theoretical price of stock options over time.
    • SAS (Statistical Analysis System) Institute became a private company in 1976 – SAS began as a project in NC State’s Agriculture department to determine the effect of seed variety, weather and soil on crop yields.
    • Personal computers and spreadsheets were launched (Lotus 1-2-3 then Microsoft Excel) and removed the need for an “Information Center” within an organization.
    • The AAAI (Association for the Advancement of Artificial Intelligence) was founded in 1979 and started releasing its quarterly magazine – AI Magazine – in 1980 (currently has over 4,000 members).
    • Oracle was the first to commercialize relational database technology, in 1979 – relational databases went on to become the dominant form of data storage in the digital economy.
    • IBM developed the first “business data warehouse” in the late ‘80’s – a model for the flow of information from operational systems to decision support environments.
    • Apple launches the Mac in 1984.
    • Microsoft launches Windows in 1985.
  • Innovations:
    • Relational database – a set of tables from which data can be accessed or reassembled in many different ways without having to reorganize the database tables. The standard user and application programming interface (API) of a relational database is the Structured Query Language (SQL). Relational databases are what most companies use today for data management.

Figure 1 – Relational Database

    • Data warehouse – data that had previously been spread across numerous sources could now be held in one place with one querying tool, drastically cutting down the time it took to access data for analysis.
    • Decision support systems (DSS) – interactive software-based systems intended to help decision makers compile useful information from a combination of raw data, documents, and business models to identify/solve problems and make decisions.
    • SQL (Structured Query Language) – allowed users to ‘Insert’, ‘Update’ and ‘Delete’ records and to ‘Create’ and ‘Drop’ tables. Queries could be written to extract data from many tables at once, helping companies to access and store their data.
    • Extract, Transform and Load (ETL) tools – technologies that allowed businesses to move data from disparate sources, transform it into one data format, and load it into the data warehouse.
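
To make the relational database, SQL and ETL ideas above a little more concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names, column names and warranty-claim rows are hypothetical and were invented purely for illustration; they do not describe any particular vendor's system. The sketch extracts records from two differently shaped sources, transforms them into one format, loads them into tables, and then answers a question that spans two tables with a single SQL query.

```python
import sqlite3

# Hypothetical raw records from two "disparate sources" (illustrative data only).
DEALER_FEED = [("C-1", "widget", "2019-03-01", "120.50"),
               ("C-2", "gadget", "2019-03-04", "80.00")]
CALL_CENTER_FEED = [{"claim": "C-3", "item": "WIDGET", "date": "03/07/2019", "usd": 45.25}]


def transform_call_center(row):
    """Transform: reshape a call-center record into the dealer feed's format."""
    month, day, year = row["date"].split("/")
    return (row["claim"], row["item"].lower(), f"{year}-{month}-{day}", str(row["usd"]))


def build_warehouse():
    """Extract both feeds, transform them to one format, and load them into SQLite."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (name TEXT PRIMARY KEY, category TEXT)")
    conn.execute("CREATE TABLE claims (claim_id TEXT PRIMARY KEY, product TEXT, "
                 "claim_date TEXT, cost REAL)")
    conn.executemany("INSERT INTO products VALUES (?, ?)",
                     [("widget", "hardware"), ("gadget", "electronics")])
    rows = DEALER_FEED + [transform_call_center(r) for r in CALL_CENTER_FEED]
    conn.executemany("INSERT INTO claims VALUES (?, ?, ?, ?)",
                     [(c, p, d, float(cost)) for c, p, d, cost in rows])
    return conn


def cost_by_category(conn):
    """Query two tables at once with a JOIN, without reorganizing either table."""
    sql = ("SELECT p.category, SUM(c.cost) FROM claims c "
           "JOIN products p ON p.name = c.product "
           "GROUP BY p.category ORDER BY p.category")
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    print(cost_by_category(build_warehouse()))
```

The point of the design is that the claims and products tables stay separate, and the JOIN reassembles them on demand, which is the "accessed or reassembled in many different ways" property described above.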

Note: The inventions of the 1970’s-1980’s provided the foundation necessary to propel the data analytics industry forward into the 21st century.

The 1990’s: Online Search and Personalization

  • Key Players:
    • Google, Amazon, eBay
  • Significant Events:
    • Database Marketing becomes the cover story in Business Week in September 1994.
    • Amazon and eBay go live in 1995 and accelerate the pace of online personalization.
    • In 1998, Google applies algorithms to web searches to maximize results relevance.
    • The terms “big data”, “data science” and “business intelligence” are coined.
    • Dozens of business intelligence vendors hit the market offering two primary types of data analytics – dashboards/reporting and organization/visualization – but consumption is limited to technical experts.
  • Innovations:
    • Database marketing – the use of large databases of information to develop models to target potential customers based on likelihood of purchase.

Figure 2 – Database Marketing

 

    • “Data marts” – scaled-down versions of data warehouses allowing business units within an organization to access information according to their needs.
    • OLAP (Online Analytical Processing) – database technology that has been optimized for querying and reporting, instead of processing transactions. It provides the ability to execute multidimensional queries quickly and easily.
    • Data mining (also known as Knowledge Discovery in Data (KDD)) – process used to extract usable data from large sets of raw data; it involves data collection, warehousing, computer processing, pattern recognition based on trend and behavior analysis, and prediction based on likely outcomes.
    • Data visualization – the presentation of data in a graphical or pictorial format, enabling decision makers to comprehend information quickly, draw conclusions, identify relationships and patterns in their data, pinpoint emerging trends and disseminate insights throughout organizations.
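
As a rough illustration of what a "multidimensional query" in the OLAP entry above means, the sketch below rolls a small set of sales facts up along different dimensions and slices them by quarter. The data, the dimension names and the helper functions (roll_up, slice_facts) are hypothetical and written only for this example. Real OLAP products typically precompute and index such aggregates; this version simply recomputes them in plain Python.

```python
from collections import defaultdict

# Hypothetical fact rows: (region, product, quarter, sales). Illustrative data only.
FACTS = [
    ("East", "widget", "Q1", 100),
    ("East", "gadget", "Q1", 150),
    ("West", "widget", "Q1", 200),
    ("East", "widget", "Q2", 120),
    ("West", "gadget", "Q2", 80),
]
DIMENSIONS = ("region", "product", "quarter")


def roll_up(facts, by):
    """Sum the sales measure over every dimension not named in `by`."""
    idx = [DIMENSIONS.index(d) for d in by]
    totals = defaultdict(int)
    for row in facts:
        key = tuple(row[i] for i in idx)
        totals[key] += row[-1]
    return dict(totals)


def slice_facts(facts, dimension, value):
    """Keep only the rows where one dimension equals a fixed value."""
    i = DIMENSIONS.index(dimension)
    return [row for row in facts if row[i] == value]


if __name__ == "__main__":
    print(roll_up(FACTS, by=("region",)))
    print(roll_up(FACTS, by=("product", "quarter")))
    print(roll_up(slice_facts(FACTS, "quarter", "Q1"), by=("region",)))
```

The same fact rows answer "sales by region", "sales by product and quarter" and "Q1 sales by region" without being restructured, which is the practical difference between analytical querying and transaction processing.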