The algorithm doesn't matter...

Musk can open-source the algo, but the data is what matters.

May 04, 2022

I’m an IT guy. I have my specialist area, but generally I have an appreciation of how things work. I’m also pretty experienced at spotting rhetorical legerdemain.

So when Elon Musk says he will open-source the Twitter algorithm, here’s a grossly simplified explanation of why I think publishing the algo doesn’t matter at all unless the accompanying data is shared too.

The algorithm is a machine. It takes input and it produces a result. You can open-source that all day long. You can publish explainers for the non-cognoscenti. Draw diagrams. Make memes. It does not matter one iota.

You can publish the blueprints of a combustion engine, but unless you know what the fuel is and what it’s supposed to drive, you cannot understand the end-to-end effect.

The moderating algo will basically do this: Identify things that are similar to stuff on list A or similar to stuff on list B. Promote things that correspond with list A, demote things that correspond with list B and send anything else to a human moderator. Another algo will observe and infer from what the human moderator does, and add the learnings to list A and/or B.
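To make that loop concrete, here is a toy sketch in Python. Everything in it is an assumption for illustration: the list contents, the `THRESHOLD` value, the function names and the use of `difflib` as a crude similarity measure are all invented, and none of it reflects Twitter's actual code.

```python
from difflib import SequenceMatcher

# Hypothetical stand-ins for "list A" and "list B"; the real data is not public.
LIST_A = ["great thread on rocket engines", "helpful explainer about batteries"]
LIST_B = ["buy cheap followers now", "click this link to win a prize"]

THRESHOLD = 0.6  # assumed similarity cut-off, chosen only for illustration


def similarity(a, b):
    """Crude text similarity in [0, 1] using only the standard library."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_match(post, examples):
    """Highest similarity between the post and any example on a list."""
    return max((similarity(post, e) for e in examples), default=0.0)


def moderate(post, list_a, list_b, threshold=THRESHOLD):
    """Promote, demote, or hand the post to a human moderator."""
    score_a = best_match(post, list_a)
    score_b = best_match(post, list_b)
    if max(score_a, score_b) < threshold:
        return "human_review"  # looks like nothing on either list
    return "promote" if score_a >= score_b else "demote"


def learn_from_moderator(post, decision, list_a, list_b):
    """The second 'algo': fold the human decision back into the lists."""
    if decision == "promote":
        list_a.append(post)
    elif decision == "demote":
        list_b.append(post)
```

Notice that `moderate` and `learn_from_moderator` say nothing about what actually gets promoted or demoted. That is decided entirely by what sits in the two lists.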

List A and list B are incredibly valuable commercial products of other machine-learning algorithms that are trained by the responses of the Twitter moderators. I guarantee these lists (or databases or matrices or whatever) will never be made public unless they are leaked by a true renegade, or some hacking takes place.

And they could publish the algo that builds list A and list B, but still the training data (input from human moderator actions) and the output (learnings and subsequent logic from that) will never be published.
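A small Python sketch shows why the training code alone tells you so little. The `train` and `score` functions and both moderator logs below are invented for illustration, and the word-counting "learning" step is a deliberately crude stand-in, not anything Twitter is known to use.

```python
from collections import Counter


def train(moderator_log):
    """Build word -> weight scores from (text, action) pairs.

    action is "promote" or "demote". This is a toy stand-in for the real
    learning step, which is not public.
    """
    weights = Counter()
    for text, action in moderator_log:
        delta = 1 if action == "promote" else -1
        for word in set(text.lower().split()):
            weights[word] += delta
    return weights


def score(text, weights):
    """Sum the learned weights of the words in a post (unknown words count 0)."""
    return sum(weights[w] for w in text.lower().split())


# Two invented moderator logs: same posts, opposite human decisions.
log_one = [("pineapple belongs on pizza", "promote"),
           ("pineapple ruins pizza", "demote")]
log_two = [("pineapple belongs on pizza", "demote"),
           ("pineapple ruins pizza", "promote")]

post = "pineapple belongs on every pizza"
print(score(post, train(log_one)))  # 2
print(score(post, train(log_two)))  # -2
```

The published code is identical in both runs. Only the moderator log differs, and the same post flips from favoured to disfavoured.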

And by the way, it can be reasonably assumed that even if that output were published, it would be largely beyond interpretation. Machine-learning algorithms very quickly build networks of inference that humans cannot reasonably unpick. Using AI/ML means handing control over to logic that we struggle to understand. Search YouTube for videos on machine-learning interpretability - they will confirm this.

So Elon Musk could buy Twitter - I will believe it when the deal is complete - and he could make some or all of the algorithms open-source, but it will not mean a goddam thing and it will not change a goddam thing. Not without a lot of other actions, such as retraining the AI/ML engines with new training data. That means - at the very least - getting control of the human moderators and scrapping a lot (probably all) of the existing input to the moderation engines.

So, enjoy the memes and the smoke and the mirrors, but do not expect anything at all to change or any curtains to be pulled back.

Plotloss

May 4, 2022

For a decent example of this, I got a 30-day Facebook ban for saying I 'forgot the fucking crackers' a couple of Christmases ago. To a Brit that means being absent-minded over a key component of Xmas dinner. To an American algorithm it's a racial epithet used to describe people of white Northern European descent, usually by African Americans...


Al Jahom's Final Word
