About The Glossary
We are committed to writing all our documents in plain and readable language. Some terms used throughout this report may be unfamiliar or industry-specific. They are defined below, along with the locations in the document where they are mentioned.
Data enrichment
The act of augmenting an existing dataset by adding relevant data (for example, adding census data to street paving information).
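As a rough illustration of enrichment, the sketch below attaches census fields to paving records by a shared key. The record layout, the `tract` key, and the `enrich` helper are all hypothetical and are not taken from the report's actual data.

```python
# Hypothetical paving records, each tagged with a census tract ID.
paving = [
    {"street": "Main St", "tract": "001"},
    {"street": "Oak Ave", "tract": "002"},
    {"street": "Pine Rd", "tract": "999"},  # no matching census entry
]

# Hypothetical census data keyed by the same tract ID.
census = {
    "001": {"population": 4200},
    "002": {"population": 3100},
}


def enrich(rows, lookup, key):
    """Return copies of rows with the fields from lookup[row[key]] added.

    Rows without a match in the lookup are kept unchanged.
    """
    enriched = []
    for row in rows:
        extra = lookup.get(row[key], {})
        enriched.append({**row, **extra})
    return enriched


enriched_paving = enrich(paving, census, "tract")
```

Rows with no match are kept rather than dropped, so enrichment never shrinks the original dataset.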
Data filtering
The act of excluding certain rows or observations from a dataset according to a rule. For example, excluding all Get It Done requests that were related to a pothole (which we’re not doing).
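The pothole example could be sketched as follows. The request records and the `category` field are made up for illustration and do not reflect the real Get It Done schema.

```python
# Hypothetical service requests; the field names are assumptions.
requests = [
    {"id": 1, "category": "pothole"},
    {"id": 2, "category": "graffiti"},
    {"id": 3, "category": "pothole"},
    {"id": 4, "category": "streetlight"},
]


def exclude_category(rows, category):
    """Keep only the rows whose category differs from the excluded one."""
    return [row for row in rows if row["category"] != category]


kept = exclude_category(requests, "pothole")
```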
ETL
Extract, Transform, Load: an automated process for moving data from one location to another, typically cleaning or reshaping it along the way.
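A minimal sketch of the three steps, using only the Python standard library. The CSV content, the `paving` table, and the function names are assumptions chosen for illustration; a real pipeline would read from and write to actual systems.

```python
import csv
import io
import sqlite3

# Hypothetical source data; a real job would read a file or an API.
SOURCE_CSV = "street,miles\nMain St, 1.5\nOak Ave,0.75\n"


def extract(text):
    """Extract: parse the raw CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: trim whitespace and convert miles to a number."""
    return [(row["street"].strip(), float(row["miles"])) for row in rows]


def load(records, conn):
    """Load: write the cleaned records into the destination table."""
    conn.execute("CREATE TABLE IF NOT EXISTS paving (street TEXT, miles REAL)")
    conn.executemany("INSERT INTO paving VALUES (?, ?)", records)
    conn.commit()


conn = sqlite3.connect(":memory:")  # in-memory database for the example
load(transform(extract(SOURCE_CSV)), conn)
```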
Fuzzy joins
Special joins that can match two strings even when they are not exactly equal but are close. For example, we would like to match the query 'mad max fury road' with the real movie title 'Mad Max: Fury Road'. Several techniques exist for doing this, and all of them are time-consuming compared with exact joins.
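One possible technique, shown here only as a sketch, is to normalize both strings and score their similarity with `difflib.SequenceMatcher` from the Python standard library. The `normalize` and `best_match` helpers, the 0.8 threshold, and the candidate titles are illustrative assumptions, not the method used in this report.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase the text and drop punctuation so only letters, digits, and spaces remain."""
    return "".join(ch for ch in text.lower() if ch.isalnum() or ch.isspace()).strip()


def best_match(query, candidates, threshold=0.8):
    """Return the candidate most similar to query, or None if nothing reaches the threshold."""
    best, best_score = None, 0.0
    for candidate in candidates:
        score = SequenceMatcher(None, normalize(query), normalize(candidate)).ratio()
        if score > best_score:
            best, best_score = candidate, score
    return best if best_score >= threshold else None


titles = ["Mad Max: Fury Road", "Mad Max 2", "Fury"]
match = best_match("mad max fury road", titles)
```

Every query is compared against every candidate, which is part of why fuzzy joins are slow on large tables.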
GIS
A geographic information system (GIS) is a system designed to capture, store, manipulate, analyze, manage, and present all types of spatial or geographical data.
Git
Git is a free and open source distributed version control system designed to handle everything from small to very large projects with speed and efficiency.
GitBook
GitBook is a tool for building books using Git and Markdown. It can generate a book in many formats: PDF, ePub, mobi, or as a website.
Hackathon
An event, typically lasting several days, in which a large number of people meet to engage in collaborative computer programming, usually as a competition.
Metadata
A set of data that describes and gives information about other data.
Open Format
Data in an open format is in a convenient and modifiable form, such that there are no unnecessary technological obstacles to its use. Specifically, data should be machine-readable, available in bulk, and provided in a format with a freely available published specification which places no restrictions, monetary or otherwise, upon its use (http://opendefinition.org/ofd/).
Open San Diego
A group of San Diego civic stakeholders who meet regularly to share knowledge and collaborate on projects to make our region a better place to live, work, and play. OpenSanDiego.org
Open Source
Open source software is software that can be freely used, changed, and shared (in modified or unmodified form) by anyone. Open source software is made by many people, and distributed under licenses that comply with the Open Source Definition.
Relational database
A type of database that organizes data into tables, which can be related to one another through unique keys assigned to rows.
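To make the idea of related tables concrete, the sketch below builds two tables in an in-memory SQLite database and joins them on a shared key. The `streets` and `repairs` tables and their contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway database for the example

# Two hypothetical tables: each repair row points at a street row
# through the street_id key.
conn.executescript(
    """
    CREATE TABLE streets (
        street_id INTEGER PRIMARY KEY,
        name TEXT
    );
    CREATE TABLE repairs (
        repair_id INTEGER PRIMARY KEY,
        street_id INTEGER REFERENCES streets(street_id),
        year INTEGER
    );
    INSERT INTO streets VALUES (1, 'Main St'), (2, 'Oak Ave');
    INSERT INTO repairs VALUES (10, 1, 2015), (11, 1, 2016), (12, 2, 2016);
    """
)

# Relate the tables by matching the key that both share.
rows = conn.execute(
    """
    SELECT s.name, r.year
    FROM repairs AS r
    JOIN streets AS s ON s.street_id = r.street_id
    ORDER BY r.repair_id
    """
).fetchall()
```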
System of Record
The software system that is the authoritative data source for a given data element or piece of information.