Frequently Asked Questions

Work in progress. Eventually I will find the time to write about our data collection and analysis methodology and the details of our ML model for classification (niche, language, demographic). For now, send your questions to [email protected]

How accurate is your demographic data?

The demographic data is fairly accurate in aggregate and less accurate for individual influencers. For example, the age and gender data are very accurate for fashion influencers as a group. For a specific influencer, the data is inaccurate roughly 5% of the time, mostly because the influencer has more than 2 major niches or languages (for example, an influencer who posts about travel and tech in 2 or 3 languages). That affects the distribution of their followers, which in turn affects the demographic calculation.

To be more technical, demographic data is calculated using machine learning models with the following inputs:

Gender: Gender is identified by analyzing followers' real names [1] and profile images [2]. An androgynous label is given to names like Sam.

Age: Age is identified by analyzing followers' profile photos and their 5 most recent Instagram photos (think of a "How old do I look?" photo app) [2] [3].

Country: We determine influencers' language and country using their profile description and post text [4].

Niche: We use ResNet50 [5] to extract features from influencers' posts and calculate their word distance to a predefined set of niche classes. For example, a food influencer will most likely have features like sardines, apple and steak extracted from their posts, and these features are closer to 'food' than to 'fitness' [6].
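
To make the niche step more concrete, here is a minimal sketch of the "word distance" idea. It is not our production code. The 3-dimensional vectors are made-up toy values standing in for real pretrained word embeddings, the labels stand in for features that an image model would extract from posts, and the names `WORD_VECTORS`, `NICHES`, `cosine` and `closest_niche` are hypothetical. Averaging cosine similarity is one reasonable way to score closeness and is an assumption here.

```python
import math

# Toy, hand-written "word vectors" (assumption). A real system would load
# pretrained embeddings with hundreds of dimensions.
WORD_VECTORS = {
    "food": (0.9, 0.1, 0.0),
    "fitness": (0.1, 0.9, 0.1),
    "sardines": (0.8, 0.1, 0.1),
    "apple": (0.7, 0.2, 0.1),
    "steak": (0.9, 0.2, 0.0),
    "dumbbell": (0.1, 0.8, 0.2),
}

# Predefined niche classes (only two here, for illustration).
NICHES = ("food", "fitness")


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def closest_niche(labels):
    """Return the niche whose vector is, on average, most similar to the labels."""
    if not labels:
        raise ValueError("need at least one extracted label")
    scores = {
        niche: sum(cosine(WORD_VECTORS[label], WORD_VECTORS[niche]) for label in labels)
        / len(labels)
        for niche in NICHES
    }
    return max(scores, key=scores.get)


print(closest_niche(["sardines", "apple", "steak"]))  # food
```

With these toy vectors, labels such as sardines, apple and steak score much higher against 'food' than against 'fitness', which mirrors the example above.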

How many influencers are you tracking at the moment?

Roughly 1.2 million.

I am an influencer. Why am I not on your site?

We only track influencers with more than 10k followers at the moment. Feel free to email me at [email protected] to submit your username.

How do you calculate engagement rate?

$$\text{Engagement Rate} = \frac{\text{Likes} + \text{Comments}}{\text{Followers}}.$$
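
As a quick worked example, the formula can be written as a small function. This is a sketch only. The function name `engagement_rate` and the sample numbers are made up for illustration, and whether likes and comments are counted per post or averaged over recent posts is an assumption not covered here.

```python
def engagement_rate(likes: int, comments: int, followers: int) -> float:
    """Return (likes + comments) / followers as a fraction."""
    if followers <= 0:
        raise ValueError("followers must be positive")
    return (likes + comments) / followers


# Hypothetical numbers: 450 likes and 50 comments on an account with 10,000 followers.
rate = engagement_rate(likes=450, comments=50, followers=10_000)
print(f"{rate:.2%}")  # 5.00%
```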

[1] Zed Gecko gender-verification by forename
[2] IEEE Workshop on Analysis and Modeling of Faces and Gestures (AMFG), at the IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), Boston, 2015
[3] Rude Carnie: Age and Gender Deep Learning with TensorFlow
[4] Fast and accurate language identification using fastText
[5] ResNet-50 Pre-trained Model for Keras
[6] Vector Representations of Words