ETL to QE, Update 13, Redefining Project Scope

Date: 2023-10-19

See Discord Binding for project context

It has become clear that producing these graphs from the Discord data does not yield very meaningful intelligence. Part of the problem is that I am not doing a very good job of communicating my results and what they represent, but the other side of the problem is that the current analysis of the data only hints at the tip of the iceberg.

Reviewing Roadmap - Dentropy Daemon

As a reminder, the plan has 4 steps:

  1. Discord Analytics Reports and Dashboard
  2. Graph Based Annotation on Top of Discord Data
  3. Allow for Generalized Questioning and add Additional Data Sources
  4. Proof of Meme Micro Bounty Platform

I am still stuck on the second part of step one, the dashboard. Through the process of producing these updates I have learned the importance of clearly articulating, and interrogating, what I want to build. Simple questions such as who the dashboard is for, what problem it solves, and what its component parts are should all be answered. I will come back to this new intuition later.

Namespace Knowledge Schemas

Today, while reviewing Roadmap - Dentropy Daemon, I came up with a vision for step 2, Graph Based Annotation on Top of Discord Data. I realize that just labelling stuff all willy-nilly is not going to get me anywhere. Different people are going to use the same labels for different things, and a lot of context is needed to understand exactly what a label represents.

This problem got me thinking. What if we use common datasets that everyone is familiar with and make them accessible to the user as separate namespaces? For example, when someone mentions a cryptocurrency in a Discord message, that message can be linked to the Wikipedia article for that cryptocurrency. All of English Wikipedia has only about 6 million articles, so storing just the article names in a database will not be difficult. Similar namespaces could be claimed for sites such as IMDB, Goodreads, and even Steam - Software. I documented a list of different types of data and datasets in Namespace Knowledge Schemas.
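To make the namespace idea a bit more concrete, here is a minimal sketch of how it could work, assuming article names are stored in a single SQLite table keyed by namespace. The table layout, the function names `build_index` and `link_message`, and the sample titles are all hypothetical and only for illustration; a real version would load the full title dumps and would need to deal with ambiguous or very common titles.

```python
import re
import sqlite3


def build_index(conn, entries):
    """Store (namespace, title) pairs; titles are lowercased for matching."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS titles ("
        "namespace TEXT NOT NULL, title TEXT NOT NULL, "
        "PRIMARY KEY (namespace, title))"
    )
    # Lookups are by title alone, so index that column separately.
    conn.execute("CREATE INDEX IF NOT EXISTS idx_title ON titles(title)")
    conn.executemany(
        "INSERT OR IGNORE INTO titles VALUES (?, ?)",
        [(ns, title.lower()) for ns, title in entries],
    )
    conn.commit()


def link_message(conn, message, max_words=3):
    """Return sorted (namespace, title) pairs whose title appears in the
    message as a run of up to max_words whole words."""
    words = re.findall(r"[a-z0-9]+", message.lower())
    found = set()
    for n in range(1, max_words + 1):
        for i in range(len(words) - n + 1):
            candidate = " ".join(words[i:i + n])
            rows = conn.execute(
                "SELECT namespace, title FROM titles WHERE title = ?",
                (candidate,),
            ).fetchall()
            found.update(rows)
    return sorted(found)


# Hypothetical sample data: a few titles from two namespaces.
conn = sqlite3.connect(":memory:")
build_index(conn, [
    ("wikipedia", "Bitcoin"),
    ("wikipedia", "Ethereum"),
    ("imdb", "The Matrix"),
])
links = link_message(conn, "Is Ethereum more like Bitcoin or The Matrix?")
print(links)
```

Keeping the namespace as an explicit column means the same string can exist in several namespaces (a book on Goodreads and its film on IMDB, say) without the labels colliding, which is the point of using shared datasets instead of free-form labels.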

HPI - Human Programmable Interface

I came up with an interesting question to ask myself: "What kind of interface would someone use to understand and program me?" At this point I have been talking for years about putting all the data an individual uses into a single sovereign API that the individual can control. It is a nice talking point, but it is too abstract to be acted upon. There is no form to this idea. But when I ask "What kind of interface would someone use to understand and program me?", I am presented with a constraint that forces the abstract idea of a sovereign API for all the data an individual has ever generated to take a form.

I now have to ask myself: if someone had access to all my emails, what would they look for in order to get to know me? What kind of analysis of all my direct messages could produce meaningful insights? What does one's location history say about a person? All this data is useless without the questions required to give it context. I should be spending more time and energy trying to ask Concepts/List/Clear Questions of Value rather than trying to produce pretty graphs.