# other.codes — Public Dataset

This folder contains downloadable outputs from the other.codes human-made mark
analysis project. Files are generated automatically from the analysis pipeline
and updated as new marks are collected and processed.

## Visualisations

### pca_vector.svg
A PCA (Principal Component Analysis) scatter plot placing each mark in a
two-dimensional style space derived from morphological measurements of its
shape, stroke character, and topology. Marks that appear close together are
more similar in measured style. Each data point is rendered as the actual
vector trace of that mark rather than an abstract marker.
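
As a rough illustration of how such a two-dimensional style space can be computed, the sketch below standardises a feature table and projects it onto its first two principal components using NumPy. It is not the project's pipeline code: the function name `pca_2d` and the toy feature values are assumptions made for this example, and the actual PCA settings are documented in METHODS.md.

```python
import numpy as np


def pca_2d(features: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Standardise columns, then project rows onto the first two principal axes.

    Hypothetical helper for illustration only.
    """
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    std[std == 0] = 1.0  # avoid dividing by zero for constant columns
    z = (features - mean) / std

    # SVD of the standardised matrix: rows of vt are the principal axes,
    # ordered by decreasing singular value.
    _, s, vt = np.linalg.svd(z, full_matrices=False)
    coords = z @ vt[:2].T
    explained = (s ** 2) / np.sum(s ** 2)
    return coords, explained[:2]


# Toy data: 5 marks x 3 made-up features (not taken from metrics.csv).
toy = np.array([
    [1.0, 0.80, 3.0],
    [2.0, 0.90, 1.0],
    [3.0, 0.70, 4.0],
    [4.0, 0.95, 2.0],
    [5.0, 0.60, 5.0],
])
coords, explained = pca_2d(toy)
```

Each row of `coords` is the position of one mark in the plot plane, and `explained` gives the share of total variance carried by each of the two axes.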

### dendrogram.svg
A hierarchical clustering dendrogram showing the similarity structure of the
full mark collection. Marks joined by shorter branches are more similar to each
other. The tree is computed using Ward linkage (Euclidean distance) on the
same 17 standardised features as the PCA. Coloured strips below the glyph
row indicate the three broadest style families identified by the analysis.
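
For readers who want to reproduce a comparable tree, the following sketch applies Ward linkage to a standardised feature matrix with SciPy and cuts the tree into three groups. The synthetic data and the use of `fcluster` with `criterion="maxclust"` are assumptions for illustration; this README does not state how the three families are extracted, so see METHODS.md for the actual clustering parameters.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

# Synthetic stand-in for the feature table: three well-separated groups
# of 4 marks each, with 17 columns to mirror the 17 features mentioned above.
rng = np.random.default_rng(0)
features = np.vstack(
    [rng.normal(centre, 0.5, size=(4, 17)) for centre in (-10.0, 0.0, 10.0)]
)

# Standardise each column (zero mean, unit variance).
z = (features - features.mean(axis=0)) / features.std(axis=0)

# Ward linkage on Euclidean distances; each row of `tree` records one merge.
tree = linkage(z, method="ward", metric="euclidean")

# Assumption: obtain three families by cutting the tree into at most 3 clusters.
families = fcluster(tree, t=3, criterion="maxclust")
```

Passing `tree` to `scipy.cluster.hierarchy.dendrogram` would draw the corresponding diagram.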

## Downloads

### vectors.zip
A ZIP archive containing one SVG vector trace per mark. Each SVG is a clean
black-stroke vector generated by tracing the binary segmentation mask of the
original photograph using vtracer (spline mode). The traces are
scale-normalised: they represent letterform shape, not physical size.
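
A minimal sketch of reading such an archive with the Python standard library is shown below. It counts `<path>` elements per SVG, loosely mirroring the path-count idea behind the vector complexity metrics. The file name `mark_001.svg`, the in-memory archive, and the assumption that each SVG declares the standard SVG namespace are all illustrative; the real archive layout may differ.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

# Assumes the SVGs declare the standard SVG namespace on the root element.
SVG_NS = "{http://www.w3.org/2000/svg}"


def count_paths(archive: zipfile.ZipFile) -> dict[str, int]:
    """Return the number of <path> elements in each .svg member of the archive."""
    counts = {}
    for name in archive.namelist():
        if not name.lower().endswith(".svg"):
            continue  # skip any non-SVG members
        root = ET.fromstring(archive.read(name))
        counts[name] = sum(1 for _ in root.iter(SVG_NS + "path"))
    return counts


# Build a tiny in-memory archive standing in for vectors.zip.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr(
        "mark_001.svg",  # hypothetical member name
        '<svg xmlns="http://www.w3.org/2000/svg">'
        '<path d="M0 0 L1 1"/><path d="M1 1 L2 0 Z"/></svg>',
    )
    zf.writestr("notes.txt", "not an svg")

buffer.seek(0)
with zipfile.ZipFile(buffer) as zf:
    path_counts = count_paths(zf)
```

To run this against the downloaded file, replace the in-memory buffer with `zipfile.ZipFile("vectors.zip")`.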

### metrics.csv
A table of morphological measurements, one row per mark. Covers shape (area,
aspect ratio, compactness, solidity), stroke width (mean, variation), skeleton
topology (branching, loops, endpoints), and vector complexity (path count,
closed-path ratio). See METHODS.md for a full description of every column.
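
The sketch below shows one way to load a table of this kind and standardise selected columns using only the standard library. The column names `mark_id`, `aspect_ratio`, and `solidity` and the sample values are hypothetical placeholders, and the choice of population standard deviation is an assumption; the real column names and preprocessing are defined in METHODS.md.

```python
import csv
import io
import statistics

# Hypothetical sample standing in for metrics.csv; real column names may differ.
SAMPLE = """mark_id,aspect_ratio,solidity
m1,1.0,0.80
m2,2.0,0.90
m3,3.0,0.70
"""


def standardise(rows: list[dict], columns: list[str]) -> list[dict]:
    """Return copies of rows with the given columns converted to z-scores."""
    out = [dict(row) for row in rows]
    for col in columns:
        values = [float(row[col]) for row in rows]
        mean = statistics.fmean(values)
        # Population standard deviation (an assumption); fall back to 1.0
        # for constant columns so the division is always defined.
        sd = statistics.pstdev(values) or 1.0
        for row_out, value in zip(out, values):
            row_out[col] = (value - mean) / sd
    return out


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
zrows = standardise(rows, ["aspect_ratio", "solidity"])
```

For the downloaded file, `csv.DictReader(open("metrics.csv", newline=""))` can replace the in-memory sample.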

### METHODS.md
Full description of the analysis pipeline: how marks are segmented, vectorised,
measured, and compared. Includes all metric definitions, preprocessing steps,
PCA settings, and clustering parameters.

### pca_interpretation.md
A plain-language interpretation of the two principal components shown in
pca_vector.svg — which morphological features drive each axis and what they
reveal about style. Generated automatically from the loadings of the current
dataset; updates whenever new marks are added.

## Dataset notes

This public dataset contains a small, geographically limited collection of
photographed marks. Results should be treated as exploratory
rather than definitive. Analysis code is available at
[github.com/untoldlabs/otherCodes](https://github.com/untoldlabs/otherCodes).

## Citation

> other.codes human-made mark dataset, Untold Labs (https://other.codes), 2026.

[other.codes](https://other.codes) — built by Untold Labs.
