Schematix

An explosive normalization tool

A Quick Introduction

Schematix is a tool for those who face the problem of data wrangling: the art, or science, of conforming your data to a desired format. Data analysis tools are in extremely high demand. However, one known issue with data analysis is inconsistent formatting between different data sources. Is my data compatible with the analysis tool I’m trying to use? This problem is only exacerbated when the same data is represented inconsistently across different media. Take, for example, a team of two, each trying to represent the same structured data. Let’s say that one person created an Excel spreadsheet while the other wrote a Python script to populate a CSV file. What if they chose inconsistent names for the values of enumerated types (e.g. “female” vs. “f” vs. “Female”)? What if one person left some cells blank, while the other didn’t? What if the person entering data into Excel made typos that corrupt various aggregation results? Suddenly, the team’s simple task of merging and aggregating their data with some analysis tool becomes much more complicated.

Schematic Fission enables you to transform spreadsheet data into a database schema by inferring functional dependencies and turning them into SQL foreign keys.
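
To make the idea of a functional dependency concrete, here is a minimal sketch of how single-column dependencies could be detected in tabular data. It is not Schematix's actual algorithm: the function names `holds` and `infer_dependencies` and the sample rows are hypothetical, chosen only for illustration. A column A functionally determines a column B when every value of A always appears alongside the same value of B.

```python
from itertools import permutations


def holds(rows, lhs, rhs):
    """Return True if each value of column `lhs` maps to exactly one value of `rhs`."""
    seen = {}
    for row in rows:
        key, value = row[lhs], row[rhs]
        # setdefault stores the first value seen for this key and returns it;
        # any later mismatch means the dependency is violated.
        if seen.setdefault(key, value) != value:
            return False
    return True


def infer_dependencies(rows):
    """List every ordered pair (a, b) of distinct columns where a determines b."""
    columns = list(rows[0])
    return [(a, b) for a, b in permutations(columns, 2) if holds(rows, a, b)]


# Hypothetical spreadsheet rows, one dict per row.
rows = [
    {"order_id": 1, "customer": "ann", "city": "Oslo"},
    {"order_id": 2, "customer": "bob", "city": "Rome"},
    {"order_id": 3, "customer": "ann", "city": "Oslo"},
]

deps = infer_dependencies(rows)
for lhs, rhs in deps:
    print(f"{lhs} -> {rhs}")
```

In this sample, `customer` determines `city`, which suggests that customer details could be moved into their own table and referenced by a foreign key. A dependency found this way is only evidence from the rows at hand; more data may contradict it, so a real tool would likely need to let the user confirm or reject each candidate.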