ciro-santilli-s-projects.bigb
= Ciro Santilli's projects
= Projects
{synonym}
Major projects can be seen at: <the most important projects done by Ciro Santilli>{full}.
A summary of minor projects is given at: <Ciro Santilli's minor projects>.
This section is a dump for anything else, to keep those sacred first sections that show on the top of the homepage clean.
= OurBigBook
{c}
{parent=Ciro Santilli's projects}
= OurBigBook Project
{synonym}
{title2}
https://docs.ourbigbook.com/
\Image[https://raw.githubusercontent.com/ourbigbook/ourbigbook/master/logo.svg]
{title=Logo of the <OurBigBook Project>}
= OurBigBook Markup
{c}
{parent=OurBigBook}
{tag=Lightweight markup language}
{tag=Personal knowledge base}
The <markup language> of <OurBigBook.com>.
Also used on <Ciro Santilli's website> as a <static website> via the <OurBigBook CLI>.
The one <markup language> to rule them all?
Documentation at: https://docs.ourbigbook.com[].
= OurBigBook CLI
{c}
{parent=OurBigBook}
{tag=Static site generator}
Official <Command-line interface> to convert a directory of <OurBigBook Markup> files into a <static website>. See also: https://cirosantilli.com/ourbigbook/ourbigbook-cli
= OurBigBook Library
{c}
{parent=OurBigBook}
Base <JavaScript> library that implements the <OurBigBook Markup>. Used by both:
* <OurBigBook CLI>
* <OurBigBook Web>
= OurBigBook Web
{c}
{parent=OurBigBook}
The website system that runs <OurBigBook.com>. For further information see:
* <OurBigBook.com>: rationale
* https://cirosantilli.com/ourbigbook/ourbigbook-web[]: project documentation
Relies on the <OurBigBook Library> to compile <OurBigBook Markup>.
\Include[ourbigbook-com]
= OurBigBook feature
{c}
{parent=OurBigBook}
= OurBigBook topic feature
{c}
{parent=OurBigBook feature}
= OurBigBook topics feature
{synonym}
More info at: https://docs.ourbigbook.com#ourbigbook-web-topics
= OurBigBook dynamic tree
{c}
{parent=OurBigBook feature}
More info at: https://docs.ourbigbook.com/ourbigbook-web-dynamic-article-tree
= x86 bare metal examples
{c}
{parent=Ciro Santilli's projects}
{splitSuffix}
https://github.com/cirosantilli/x86-bare-metal-examples
As mentioned at <Linux Kernel Module Cheat>{full}, this should be merged into that other project.
= Ciro Santilli's naughty projects
{c}
{parent=Ciro Santilli's projects}
If <Ciro Santilli> weren't a <Ciro Santilli's campaign for freedom of speech in China>[natural born activist], he could have made an excellent <intelligence analyst>! See also: <Being naughty and creative are correlated>{full}.
* <Stack Overflow Vote Fraud Script>
* <GitHub> makes Ciro feel especially naughty:
  * <All GitHub Commit Emails>: he extracted (almost) all Git commit emails from <GitHub> with <Google BigQuery>
  * https://github.com/cirosantilli/test-many-commits-1m/[A repository with 1 million commits]: likely the https://www.quora.com/Which-GitHub-repo-has-the-most-commits/answer/Ciro-SantilliI[live repo with the most commits as of 2017]
  * https://stackoverflow.com/questions/20099235/who-is-the-user-with-the-longest-streak-on-github/27742165#27742165[A 100 year GitHub streak], likely the longest ever when that existed. It was consuming too much <server> resources however, which led to GitHub admins manually https://web.archive.org/web/20151021135921/https://github.com/cirosantilli/[turning off his contribution history].
  * https://github.com/cirosantilli/test-octopus-100k[A repository with a 100k commit Git octopus merge]. Now that is a true https://softwareengineering.stackexchange.com/questions/314215/can-a-git-commit-have-more-than-2-parents/377903#377903[Cthulhu merge].
  * https://github.com/isaacs/github/issues/1718[500 on adoc infinite header xref recursion]: that was fun while it lasted
Outside this website:
* https://cirosantilli.com/china-dictatorship/zhihu-censorship-of-hao-haidong
= All GitHub Commit Emails
{c}
{parent=Ciro Santilli's naughty projects}
{tag=Open-source intelligence}
{tag=Ciro Santilli's data projects}
https://github.com/cirosantilli/all-github-commit-emails
In this project <Ciro Santilli> extracted (almost) all Git commit emails from <GitHub> with <Google BigQuery>! The repo was later taken down by <GitHub>. Newbs, censoring publicly available data!
Ciro also created a beautifully named variant with one email per commit: https://github.com/cirosantilli/imagine-all-the-people[]. True art. It also had the effect of breaking this "what's my first commit" tracker: https://twitter.com/NachoSoto/status/1761873362706698469
\Image[https://raw.githubusercontent.com/cirosantilli/media/master/GitHub_Archive_Google_bigquery_PushEvent_email_highlight.png]
{height=810}
{title=<#GitHub Archive> query showing hashed emails}
{description=It was <Ciro Santilli> that made them hash the emails. They weren't hashed before he published the emails publicly.}
\Image[https://raw.githubusercontent.com/cirosantilli/media/master/All_GitHub_commit_emails_repo_screenshot_before_takedown_archive_is.png]
{height=768}
{title=<All GitHub Commit Emails> repo before takedown}
{description=Screenshot from <archive.is>.}
= Facebook profile face dump
{c}
{parent=Ciro Santilli's naughty projects}
{tag=Ciro Santilli's data projects}
In 2016 Ciro made a script that downloaded <Facebook> profile pictures.
This was possible at the time without any login, since profile picture access was not authenticated, by using a 2010 profile ID dump originally announced at: https://blog.skullsecurity.org/2010/return-of-the-facebook-snatchers
The profile ID dump was downloadable through a <BitTorrent> torrent named `fbdata.torrent` of about 2.8 GB, mostly compressed. Doing:
``
find . -type f | xargs sha256sum | sha256sum
``
on Ubuntu 20.04 gives:
``
2c9a739c9c5495e38ebab81fc67411b7c6562f139dcb8619901a3f01230efdd5
``
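Note that the output of the command above depends on the order in which `find` lists the files, which is not guaranteed to be the same across filesystems. A minimal Python sketch of an order-independent variant, which sorts the paths before hashing, could look like the following. The names `file_sha256` and `tree_digest` are made up for illustration, and this is not the procedure that produced the hash above, so it will not necessarily reproduce that value:

```python
import hashlib
import os

def file_sha256(path, chunk_size=1 << 20):
    """Return the hex SHA-256 of one file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def tree_digest(root):
    """Hash every file that os.walk lists under root, then hash the sorted listing.

    Each listing line mimics the "HASH  PATH" layout printed by sha256sum,
    with paths relative to root and prefixed by "./".
    """
    rel_paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            rel_paths.append("./" + rel)
    outer = hashlib.sha256()
    # Sorting makes the result independent of directory listing order.
    for rel in sorted(rel_paths):
        inner = file_sha256(os.path.join(root, rel))
        outer.update(f"{inner}  {rel}\n".encode("utf-8", "surrogateescape"))
    return outer.hexdigest()
```

Calling `tree_digest(".")` from inside the extracted dump directory would then give a single checksum that does not change with the directory listing order.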
This dump was widely reported e.g. on <Hacker News> at: https://news.ycombinator.com/item?id=1554558[].
At some point however, Facebook finally started to require tokens to view public profile pictures, thus making such further collection impossible, e.g. as of 2021: https://developers.facebook.com/docs/graph-api/reference/v9.0/user/picture[] mentions:
\Q[Querying a User ID (UID) now requires an access token.]
This is also mentioned e.g. at: https://stackoverflow.com/questions/11442442/get-user-profile-picture-by-id[]. This major privacy flaw was therefore finally addressed at some point, making it impossible to reproduce this project.
Ciro downloaded 10 thousand of those pictures, and did facial extraction with: https://stackoverflow.com/questions/13211745/detect-face-then-autocrop-pictures/37501314#37501314
He then created a single video by joining 10 thousand of those cropped faces, which can be uploaded e.g. to <YouTube>. Ciro later decided it was better to make those videos private however, as sooner or later he'd lose his account for it.
<Companies> like <YouTube> blocking this kind of content is the type of thing that makes companies take longer to fix such gaping privacy issues, and is a bit like <security through obscurity>. A video makes it clear to everyone that there is a privacy issue very effectively. But people prefer to hide and look away, and then 99% of people who know nothing about tech get their privacy busted by actual criminals/government spies and never learn about it.
But now that Facebook finally fixed it, it's fine, no need for the video anymore.
= Ciro Santilli's data projects
{parent=Ciro Santilli's projects}
<Ciro Santilli> has enjoyed doing projects dealing with lots of data! They usually have a large overlap with <Ciro Santilli's naughty projects>, but not always!
= Wikipedia CatTree
{c}
{parent=Ciro Santilli's data projects}
{splitSuffix}
{tag=Ciro Santilli's minor projects}
This mini-project walks the category hierarchy of <Wikipedia dumps> and dumps it in various simple formats, <HTML> being the most interesting!
* <HTML> dumps: https://cirosantilli.com/wikipedia-cattree/
* methodology: https://stackoverflow.com/questions/17432254/wikipedia-category-hierarchy-from-dumps/77313490#77313490
Scripts used:
* \a[wikipedia/import-sqlite.sh]
* \a[wikipedia/sqlite_preorder.py]
* \a[wikipedia/wikipedia-cattree.sh]
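The general idea of walking a category hierarchy can be sketched in Python as a preorder depth-first traversal. This is not the code of the scripts listed above: the `edges(parent, child)` table is a hypothetical simplification of the real dump schema, and the function name `preorder` and the sample rows are made up for illustration. Since the category graph is not guaranteed to be a tree, the sketch keeps a set of already seen categories so that each one is output only once and cycles terminate:

```python
import sqlite3

def preorder(conn, root):
    """Yield (depth, title) pairs in preorder, visiting each category once."""
    seen = {root}
    stack = [(0, root)]
    while stack:
        depth, title = stack.pop()
        yield depth, title
        children = [row[0] for row in conn.execute(
            "SELECT child FROM edges WHERE parent = ? ORDER BY child",
            (title,),
        )]
        # Push in reverse order so the alphabetically first child is popped first.
        for child in reversed(children):
            if child not in seen:
                seen.add(child)
                stack.append((depth + 1, child))

# Hypothetical sample data, including a duplicate parent and a cycle.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (parent TEXT, child TEXT)")
conn.executemany("INSERT INTO edges VALUES (?, ?)", [
    ("Mathematics", "Algebra"),
    ("Mathematics", "Geometry"),
    ("Algebra", "Linear algebra"),
    ("Geometry", "Algebra"),            # second parent of Algebra
    ("Linear algebra", "Mathematics"),  # cycle back to the root
])

# Print the hierarchy as an indented outline.
for depth, title in preorder(conn, "Mathematics"):
    print("  " * depth + title)
```

An explicit stack is used instead of recursion because a deep hierarchy could otherwise hit the Python recursion limit.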
\Image[https://raw.githubusercontent.com/cirosantilli/media/master//Wikipedia_CatTree.png]
{title=<Mathematics> dump of <Wikipedia CatTree>}
{source=https://cirosantilli.com/wikipedia-cattree/Mathematics}
\Include[ciro-santilli-s-open-source-contributions]{parent=Ciro Santilli's projects}