Lecture 1 Notes

What is Data?

  1. Data: information that is processable by a computer

    • Information is in digital form

    • A program can read and analyze the information
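
As a rough illustration of the two points above, the Python sketch below treats a small piece of digital information (CSV text) as data: a program reads it and computes a simple summary. The sample values and the names `SAMPLE_CSV` and `average_temperature` are made up for this example.

```python
import csv
import io

# Hypothetical sample: information stored in digital (CSV) form.
SAMPLE_CSV = """city,temperature_c
Sydney,22.5
Melbourne,18.0
Brisbane,26.5
"""


def average_temperature(csv_text: str) -> float:
    """Read the CSV text and return the mean of the temperature column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    temps = [float(row["temperature_c"]) for row in reader]
    return sum(temps) / len(temps)


print(average_temperature(SAMPLE_CSV))
```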

Accessible Data

  1. Right to access the data

  2. Procedure to obtain or use the data

    • Option 1: Locally
    • Option 2: Remotely
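
The two options can be sketched in Python as follows. This is only one possible way to do it, assuming the data is UTF-8 text; the function names `read_local` and `read_remote` are introduced here for illustration.

```python
from pathlib import Path
from urllib.request import urlopen


def read_local(path: str) -> str:
    """Option 1: the data is a file on this machine."""
    return Path(path).read_text(encoding="utf-8")


def read_remote(url: str, timeout: float = 10.0) -> str:
    """Option 2: the data lives elsewhere and is fetched by URL."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Usage (hypothetical locations, shown as comments so nothing is fetched here):
#   text = read_local("data/sample.csv")
#   text = read_remote("https://example.org/data/sample.csv")
```

Either way the program ends up with the same text; what differs is where the data is stored and what is needed to reach it (a file path versus a network connection and a URL).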

API

  • An API is a document that specifies which queries are possible, the format of those queries, and the format of the responses.
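
To make this concrete, here is a minimal sketch of what following an API looks like from the client side: the query is built in the format the API prescribes, and the response is parsed in the format it promises. The endpoint `https://api.example.org/v1/observations`, the parameters `station` and `limit`, and the JSON layout are all hypothetical and not taken from any real service.

```python
import json
from urllib.parse import urlencode

BASE_URL = "https://api.example.org/v1/observations"  # hypothetical endpoint


def build_query(station: str, limit: int) -> str:
    """Format a query the way the (hypothetical) API specifies."""
    return f"{BASE_URL}?{urlencode({'station': station, 'limit': limit})}"


def parse_response(body: str) -> list[float]:
    """Extract the values from a response in the documented JSON format."""
    payload = json.loads(body)
    return [item["value"] for item in payload["results"]]


# A response body shaped the way the hypothetical API promises.
SAMPLE_RESPONSE = '{"results": [{"value": 1.5}, {"value": 2.0}]}'

print(build_query("A12", 2))
print(parse_response(SAMPLE_RESPONSE))
```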

## Licenses for Data

The following are details of the different licenses:

| License | What it means |
| --- | --- |
| CC0 | No restrictions |
| Attribution | Users must credit the original owner |
| Attribution ShareAlike | Users must credit the original owner; datasets derived from yours must be shared under the same license |
| Attribution NoDerivatives | Users must credit the original owner; the dataset may not be shared in modified form |
| Attribution NonCommercial | Users must credit the original owner; no commercial use |
| Attribution NonCommercial ShareAlike | Users must credit the original owner; no commercial use; datasets derived from yours must be shared under the same license |
| Attribution NonCommercial NoDerivatives | Users must credit the original owner; no commercial use; the dataset may not be shared in modified form |

Types of Data

  • Tabular Data
  • Time Series Data
  • Geospatial Data
  • Network Data
  • Text Data
  • Multimedia Data
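
One way to picture these types is by how each might be held in a program. The Python sketch below uses plain built-in structures; all values are invented for illustration, and real projects would often use dedicated libraries instead.

```python
from datetime import date

# Tabular: rows that share the same named columns.
tabular = [
    {"name": "Ada", "age": 36},
    {"name": "Alan", "age": 41},
]

# Time series: values ordered by timestamp.
time_series = [(date(2024, 1, 1), 10.2), (date(2024, 1, 2), 11.0)]

# Geospatial: values tied to a location (latitude, longitude).
geospatial = {"label": "sample point", "lat": -33.87, "lon": 151.21}

# Network: nodes and the links between them (adjacency list).
network = {"A": ["B", "C"], "B": ["C"], "C": []}

# Text: free-form natural language.
text = "Data is information that a computer can process."

# Multimedia: binary content such as images, audio, or video.
multimedia = bytes([0x89, 0x50, 0x4E, 0x47])  # first four bytes of the PNG file signature
```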

Common Data Terms

  • Privacy and Sensitive Data
  • Metadata – the data about data
  • Unstructured & Structured Data (e.g., using XML markup to make data structured)
  • Databases
  • Data Silos
  • Data “Born Digital”
  • Data Quality (often poor, so the data needs pre-processing)
  • Data Cleaning
  • Raw Data (Newly collected data before cleaning or pre-processing)
  • Open Data
  • Big Data
    - Volume: very large quantities
    - Variety: very diverse in nature
    - Velocity: very fast collection rates
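
Several of the terms above (structured data via XML markup, metadata, raw data, data quality, data cleaning) can be seen together in a small Python sketch. The XML snippet, the `source` attribute used as metadata, and the function name `clean_readings` are all hypothetical; the cleaning rules are just one reasonable choice.

```python
import xml.etree.ElementTree as ET

# Hypothetical raw data: XML markup gives each value a named place
# (structure), and the "source" attribute is metadata about the readings.
# The values themselves are still raw and need cleaning.
RAW_XML = """
<readings source="sensor-7">
  <reading><city> Sydney </city><temp>22.5</temp></reading>
  <reading><city>melbourne</city><temp>n/a</temp></reading>
  <reading><city>Brisbane</city><temp> 26.5</temp></reading>
</readings>
"""


def clean_readings(xml_text: str) -> list[dict]:
    """Parse the XML, normalise city names, and drop unusable temperatures."""
    root = ET.fromstring(xml_text)
    cleaned = []
    for reading in root.findall("reading"):
        city = reading.findtext("city", default="").strip().title()
        try:
            temp = float(reading.findtext("temp", default=""))
        except ValueError:
            continue  # skip rows whose temperature is missing or malformed
        cleaned.append({"city": city, "temp_c": temp})
    return cleaned


print(clean_readings(RAW_XML))
```

Dropping the malformed row is the simplest policy; depending on the analysis, filling in or flagging missing values may be a better fit.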