CrunchBase, People+, and the EFF
We are in the midst of a disagreement with a new mobile startup called People+. Wired.com broke the story and the Electronic Frontier Foundation (EFF) has elected to represent People+. The dispute centers on whether People+ has the right to use the entire CrunchBase dataset to build a new product directly competitive with CrunchBase. In the CrunchBase Terms of Service, we cite a Creative Commons plus attribution license as our foundation, and we also reserve, among other things, the right to restrict or prohibit use of the data under certain circumstances, such as use of the data to create a direct competitor to CrunchBase. On that basis, an AOL attorney last week asked People+ not to use our data in their product.
In a letter dated Nov. 4, EFF attorney Mitchell Stoltz argued that People+ does have the right to use the data, and made a number of points in connection with the Creative Commons license. We have a great deal of respect for the EFF, and we are meeting with Mr. Stoltz to discuss EFF’s arguments on behalf of People+. We would like to bring this issue to a quick and fair resolution, and we expect to learn a few things along the way about Creative Commons’ best practices and crowd-sourced data platforms.
We should also note that the AOL legal team, which issued a cease-and-desist letter to People+, was acting on our behalf and at our direction. It’s true that AOL owns CrunchBase, which it acquired with TechCrunch in 2010, but AOL has given us broad authority to pursue the development of CrunchBase.
For context, the data in CrunchBase currently adds up to 488k data points about companies, people, funds, fundings and events. The data comes from a wide variety of sources. More than 53k individual contributors have added data so far this year, and the CrunchBase Venture Program now counts over 400 members who are supplying data about both legacy and new deals. We are also partnering with a number of startup databases like ILVenture to expand our coverage internationally, and we monitor thousands of Web-based resources, including TechCrunch and SEC registration data.
Given the wide range of data sources, it’s no surprise that it takes a sizeable data team to de-dupe, validate, and classify everything in CrunchBase. This year alone, for example, we have already merged 32k duplicate profiles, re-categorized 24k companies, and updated thousands of companies that have either closed or been acquired. In other words, the CrunchBase data set is widely sourced and intensively curated. If this were not the case, it would not be so useful to the 2 million people who use it each month.
Thanks to the hard work that has gone into improving CrunchBase data, we have recently received many inquiries about licensing the CrunchBase dataset for commercial use. In general, we have been happy to work with these partners, all of whom are developing interesting applications (both web and mobile) on top of the CrunchBase dataset. Our arrangements with these firms entail attribution to CrunchBase, where appropriate, and often engage the partner to contribute new data to CrunchBase. In some cases, the partner also pays a fee, which varies based on the size of the business and what data they can contribute back to CrunchBase. For example, in our partnership with AngelList, that site’s users may opt in to share data with CrunchBase. In exchange, CrunchBase carries links on relevant profiles to the fundraising and employee recruiting efforts of the corresponding AngelList startups.
In our view, People+ differed from this category of partnership because they used the complete CrunchBase dataset to create a product that would compete directly with CrunchBase, instead of building on top of it. Our ability to restrict that activity depends on our terms of service, and that’s what we are discussing with People+ and the EFF.
It’s important to add that the vast majority of heavy consumers of CrunchBase data use the data at no charge. More than 100 million calls are made to our API each month by 5k registered API users. Only a tiny number of those are commercial or potentially commercial arrangements. We also make Excel exports of our data available for anyone to download, and tens of thousands of people do so on a regular basis. In other words, CrunchBase data is remarkably easy to access at no charge for the vast majority of users.
Our vision for CrunchBase is to safeguard and advance what makes CrunchBase so valuable to the startup community: a single, nearly comprehensive and timely database of startup information that helps everyone in the community invest, partner, and work better. CrunchBase must remain open to anyone who wants to contribute, and retrieving that data for non-commercial benefit must remain open as well. That said, investing in CrunchBase’s constant improvement requires building a business around CrunchBase in a way that successfully takes into account our terms of service and our openness. We are confident that this is possible, and we are working out how to do it.