whosonfirst-brands

Brands in Who's On First documents.

Caveats

This is a work in progress and very much still "wet paint". There is little to no tooling for this stuff yet.

Where do all these #brands come from?

At the moment, they come from the Elasticsearch index running the Who's On First Spelunker. They are the product of a not very sophisticated faceting process on an unanalyzed copy of the wof:name field (called, unsurprisingly, name_not_analyzed). Like this:

curl -s -v --max-time 600 'http://localhost:9200/spelunker/_search?from=0&size=50' -d '{"query": {"term": {"wof:placetype": "venue"}}, "aggregations": {"brands": {"terms": {"field": "name_not_analyzed", "size": 0}}}, "size": 0}' > brands.json

That produces something like 16 million distinct names. We have not imported most of those. Instead we have limited the #brands included here to only those with 50 (or more) venues. So instead of 16 million #brands we have about 7,400 as of this writing. Maybe the cut-off point should be 25, maybe it should be 10. Maybe it should be 5. We don't know yet. We're figuring it out as we go.
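
The cut-off described above could be applied to brands.json with something like the following. This is a minimal sketch, not part of the tooling in this repository. It assumes brands.json holds a standard Elasticsearch terms aggregation response, where each bucket under aggregations.brands.buckets carries a key (the name) and a doc_count (the number of venues). The function names, the min_venues parameter and the sample data are all hypothetical.

```python
import json


def filter_brands(response, min_venues=50):
    """Return (name, venue_count) pairs with at least min_venues venues.

    Assumes the usual Elasticsearch terms aggregation layout:
    response["aggregations"]["brands"]["buckets"] is a list of
    {"key": <name>, "doc_count": <count>} dictionaries.
    """
    buckets = response.get("aggregations", {}).get("brands", {}).get("buckets", [])
    return [(b["key"], b["doc_count"]) for b in buckets if b["doc_count"] >= min_venues]


def load_brands(path="brands.json", min_venues=50):
    """Read an aggregation response from disk and apply the cut-off."""
    with open(path) as fh:
        return filter_brands(json.load(fh), min_venues)


# Made-up sample data, for illustration only (not real Who's On First counts).
SAMPLE = {
    "aggregations": {
        "brands": {
            "buckets": [
                {"key": "Example Coffee", "doc_count": 120},
                {"key": "Example Bank", "doc_count": 50},
                {"key": "Corner Store", "doc_count": 12},
                {"key": "Joe's Diner", "doc_count": 1},
            ]
        }
    }
}

if __name__ == "__main__":
    # Compare how many names survive each of the cut-offs under discussion.
    for cutoff in (50, 25, 10, 5):
        print(cutoff, len(filter_brands(SAMPLE, min_venues=cutoff)))
```

Running the same filter with min_venues set to 25, 10 or 5 against the real brands.json is one way to see how quickly the list grows as the cut-off drops.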

It is assumed that a whole bunch of these records will be superseded or deprecated or both. That work remains tomorrow's problem.
