Datasets / Sites | Tables | Columns | Rows | Data Bytes | Media Bytes | Media Items |
---|---|---|---|---|---|---|
2,903 (+ 3) | 10,722 (+ 4) | 180,656 (+ 153) | 1,663,842,947 (+ 33,569,451) | 463.49G (+ 4.51G) | 1.47T (+ 98.81G) | 27,128,512 (+ 1,987,187) |
WHAT IS DATASN?
DataSN, or Data Source Network, crawls, parses, and hosts all the data of the Internet: not raw web pages, but data objects that are both machine-friendly and human-readable. More than a website scraper, DataSN extracts, cleanses, normalizes, categorizes, and formats data.
Every table row is stamped with the time it was created or last updated, so you can easily find the rows that have been created or updated since your last retrieval. Some datasets have history traceback enabled, with all historical values archived for a given field.
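As a rough illustration of how row timestamps support incremental retrieval, the sketch below filters a local list of rows by time. The field name `updated_at`, the sample rows, and the helper `rows_changed_since` are assumptions made for this example; they are not DataSN's documented schema or API.

```python
from datetime import datetime, timezone

# Hypothetical rows: the "updated_at" field name and values are illustrative only.
rows = [
    {"id": 1, "name": "alpha", "updated_at": "2021-03-01T10:00:00+00:00"},
    {"id": 2, "name": "beta", "updated_at": "2021-03-05T08:30:00+00:00"},
    {"id": 3, "name": "gamma", "updated_at": "2021-03-09T17:45:00+00:00"},
]


def rows_changed_since(rows, last_retrieval):
    """Return the rows whose timestamp is strictly later than last_retrieval."""
    return [
        row for row in rows
        if datetime.fromisoformat(row["updated_at"]) > last_retrieval
    ]


# Pretend the previous retrieval happened on 4 March 2021 (UTC).
last_retrieval = datetime(2021, 3, 4, tzinfo=timezone.utc)
changed = rows_changed_since(rows, last_retrieval)
print([row["id"] for row in changed])
```

A client that stores the time of its last successful retrieval only needs to process the rows this filter returns, rather than the whole table.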
DataSN datasets are rigorously sanitized and cleansed, ready for software consumption. We frown upon unparsed strings and raw bytes, and instead provide atomic data values that are immediately usable by the simplest of programs.
DataSN columns and tables are named for the meaning and semantic nature of the data, so you instantly know what the data is about.
Data are published via API as soon as they are crawled, so your program knows what is happening in the real world by the minute.
Data should be format-neutral rather than bound to a specific application. DataSN data is not tied to any proprietary application: the same piece of data is delivered in every format you can imagine, the most popular being JSON, XML, CSV, Excel, and HTML. Advanced formats, such as MySQL and MSSQL, are available on request.
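To make the idea of format neutrality concrete, here is a minimal sketch that renders one record as JSON, CSV, and XML using only the Python standard library. The record contents and the helper names `to_json`, `to_csv`, and `to_xml` are hypothetical; this is not how DataSN itself produces its output.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical record with made-up values, used only for illustration.
record = {"id": 7, "name": "example", "value": 42}


def to_json(rec):
    """Serialize the record as a JSON object with sorted keys."""
    return json.dumps(rec, sort_keys=True)


def to_csv(rec):
    """Serialize the record as a header line plus one CSV row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rec), lineterminator="\n")
    writer.writeheader()
    writer.writerow(rec)
    return buf.getvalue()


def to_xml(rec, tag="row"):
    """Serialize the record as one XML element with a child per field."""
    root = ET.Element(tag)
    for key, value in rec.items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


print(to_json(record))
print(to_csv(record), end="")
print(to_xml(record))
```

The record is held once, in a neutral in-memory form, and each output format is just a different view of it.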
All DataSN data records are properly related to each other, forming a traversable network of relations, or knowledge map. Data are structurally normalized to reduce redundancy and to facilitate association, categorization, and searching.
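One way to picture a traversable network of records is a breadth-first walk over explicit links between them. The sketch below assumes a toy in-memory store keyed by `(table, id)` with a `links` list on each record; the store layout, the sample records, and the function `reachable` are all invented for illustration and say nothing about DataSN's internal model.

```python
from collections import deque

# Hypothetical normalized records keyed by (table, id); "links" lists related keys.
records = {
    ("company", 1): {"name": "Acme", "links": [("product", 10), ("product", 11)]},
    ("product", 10): {"name": "Widget", "links": [("image", 100)]},
    ("product", 11): {"name": "Gadget", "links": []},
    ("image", 100): {"name": "widget.png", "links": [("product", 10)]},
}


def reachable(records, start):
    """Walk related records breadth-first, visiting each record once."""
    seen = {start}
    order = [start]
    queue = deque([start])
    while queue:
        key = queue.popleft()
        for linked in records[key]["links"]:
            if linked not in seen:
                seen.add(linked)
                order.append(linked)
                queue.append(linked)
    return order


for table, record_id in reachable(records, ("company", 1)):
    print(table, record_id, records[(table, record_id)]["name"])
```

Because each record is stored once and referenced by key, the back-link from the image to its product does not cause the walk to loop.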
DataSN crawls not just text but also media. Media files, such as images, are exhaustively collected, meticulously tagged, categorized, and associated with their particular data row(s), so they are searchable and retrievable by information about them.
but only up to 2021