Why are you talking so transparently about your learnings and ideas?

Surprisingly, I feel that the VC industry is still at day one in terms of digitization and data-driven innovation. Most firms today use CRMs (although nobody seems really happy and teams jump across solutions), one or two commercial database providers like Crunchbase, Dealroom or Pitchbook (which are becoming increasingly redundant and hard to differentiate) and simple productivity tools like Calendly, Mixmax or Superhuman. Their approach to data-driven sourcing, screening or even portfolio value creation, however, is in most cases either nonexistent or depends on external solutions like Specter or SourceScrub.

Very few funds actually have the means (= management fee), the commitment from leadership, someone who bridges the gap between the investment and the engineering world, and access to world-class engineering talent to really do something innovative with data. As a result, the majority of readers are not in a position to act upon most of the shared content but rather perceive it as inspiration and a rough idea of what is possible. It eventually informs their decision to start becoming more data-driven and helps them avoid stupid mistakes. For everyone else, those who are in a position to build something in-house, the majority of the content is probably “Hey, that’s exactly what we do”!

What is the secret sauce in data-driven VC?

Short-term: If you do something, you’re already ahead of the majority.

Mid-term: Creativity with respect to identification sources becomes key. Commercial data providers are becoming increasingly redundant, and a “creativity race” has started to be among the first to find novel identification sources. For example, one or two years back it was highly innovative to leverage LinkedIn Sales Navigator to search for and crawl profiles of professionals who changed their title from whatever to “Starting something new” or “Stealth mode”. In many cases, this approach helped us identify new startups even before they got officially registered. It certainly kept us ahead for some time, but it is really no rocket science to replicate once the secret (in this case just a website/source and a search term) is out.

The availability of proprietary data such as pitch decks, financial information, notes from founder interactions, etc. will become increasingly important. Thankfully, at Earlybird we have granular data on 25+ years of transactions and founder interactions. This proprietary information can then be leveraged to verify and complement publicly available data (as per episode #3 on startup database benchmarking and episode #4 on how to scrape alternative data sources) to eventually receive a more balanced and well-rounded picture of every company. In turn, this helps us to improve screening models and identify the best opportunities earlier than others.

As said earlier in episode #6 and episode #7, creativity with respect to feature engineering becomes the ultimate differentiator. Feature engineering is a machine learning technique that leverages raw data and existing variables/features to create new variables/features that aren’t in the training set. The goal of this technique is to speed up data transformations while also enhancing model accuracy (TowardsDataScience). While creativity with respect to identification sources is fairly easy to copy, creativity with respect to feature engineering is significantly more complex and requires proper data extraction, data cleaning and novel approaches for feature creation.

For example, we collect the social media profiles of founders. From these, we use a model to predict the Big Five character traits of the respective founders. We assume that the Big Five character traits of founders do not (or only slightly) change over time. Subsequently, we take a sample of successful companies to identify patterns within their founders’ character traits that can predict the success of the respective companies.
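To make the founder-personality example from this post a bit more tangible, here is a minimal Python sketch of what the feature engineering step could look like. It is not our production pipeline. It assumes that an upstream model has already predicted Big Five scores between 0 and 1 for each founder, and the function names `team_features` and `feature_gap`, the choice of mean/max/range as aggregates, and all scores are hypothetical placeholders.

```python
from statistics import mean

# The five traits of the Big Five personality model.
BIG_FIVE = ("openness", "conscientiousness", "extraversion",
            "agreeableness", "neuroticism")


def team_features(founders):
    """Turn per-founder trait scores into company-level features.

    `founders` is a list of dicts mapping each trait name to a score in [0, 1].
    The scores are assumed to come from an upstream prediction model.
    """
    features = {}
    for trait in BIG_FIVE:
        scores = [founder[trait] for founder in founders]
        features[f"{trait}_mean"] = mean(scores)
        features[f"{trait}_max"] = max(scores)
        # Spread within the team: 0.0 for a solo founder.
        features[f"{trait}_range"] = max(scores) - min(scores)
    return features


def feature_gap(successful, others, name):
    """Average of feature `name` among successful companies minus the rest."""
    return mean(c[name] for c in successful) - mean(c[name] for c in others)


# Placeholder scores, invented purely for illustration.
company_a = team_features([
    {"openness": 0.75, "conscientiousness": 0.5, "extraversion": 0.25,
     "agreeableness": 0.5, "neuroticism": 0.25},
    {"openness": 0.5, "conscientiousness": 0.75, "extraversion": 0.75,
     "agreeableness": 0.5, "neuroticism": 0.5},
])
company_b = team_features([
    {"openness": 0.25, "conscientiousness": 0.5, "extraversion": 0.5,
     "agreeableness": 0.75, "neuroticism": 0.5},
])

print(company_a["openness_mean"])
print(feature_gap([company_a], [company_b], "openness_mean"))
```

The raw inputs are five scores per founder. The engineered features (team mean, team maximum and within-team spread per trait) do not exist in the raw data, and those are what a screening model would compare between successful companies and the rest. A real analysis would need a far larger sample and a proper statistical test rather than a simple difference of averages.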