Published site at d5aaeee88b331e064830a2774f4fed238631457c.
1 <!DOCTYPE html>
2 <!--
3 | Generated by Apache Maven Doxia Site Renderer 1.6
4 | Rendered using Apache Maven Fluido Skin 1.5-HBASE
5 -->
6 <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
7 <head>
8 <meta charset="UTF-8" />
9 <meta name="viewport" content="width=device-width, initial-scale=1.0" />
10 <meta name="Date-Revision-yyyymmdd" content="20180311" />
11 <meta http-equiv="Content-Language" content="en" />
12 <title>Apache HBase &#x2013; Powered By Apache HBase&#x2122;</title>
13 <link rel="stylesheet" href="./css/apache-maven-fluido-1.5-HBASE.min.css" />
14 <link rel="stylesheet" href="./css/site.css" />
15 <link rel="stylesheet" href="./css/print.css" media="print" />
16
17
18 <script type="text/javascript" src="./js/apache-maven-fluido-1.5-HBASE.min.js"></script>
19
20
21
23
24
25 <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/2.3.2/css/bootstrap-responsive.min.css"/>
26
27
28 <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.9.1/styles/github.min.css"/>
29
30
31 <link rel="stylesheet" href="css/site.css"/>
32
33
34 <script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.9.1/highlight.min.js"></script>
35
36 </head>
37 <body class="topBarEnabled">
38
39
40
41
42
43 <div id="topbar" class="navbar navbar-fixed-top ">
44 <div class="navbar-inner">
45 <div class="container">
46 <a data-target=".nav-collapse" data-toggle="collapse" class="btn btn-navbar">
47 <span class="icon-bar"></span>
48 <span class="icon-bar"></span>
49 <span class="icon-bar"></span>
50 </a>
51 <div class="nav-collapse">
52
53 <ul class="nav">
54 <li class="dropdown">
55 <a href="#" class="dropdown-toggle" data-toggle="dropdown">Apache HBase Project <b class="caret"></b></a>
56 <ul class="dropdown-menu">
57
58 <li> <a href="index.html" title="Overview">Overview</a>
59 </li>
60
61 <li> <a href="license.html" title="License">License</a>
62 </li>
63
64 <li> <a href="http://www.apache.org/dyn/closer.cgi/hbase/" title="Downloads">Downloads</a>
65 </li>
66
67 <li> <a href="https://issues.apache.org/jira/browse/HBASE?report=com.atlassian.jira.plugin.system.project:changelog-panel#selectedTab=com.atlassian.jira.plugin.system.project%3Achangelog-panel" title="Release Notes">Release Notes</a>
68 </li>
69
70 <li> <a href="coc.html" title="Code Of Conduct">Code Of Conduct</a>
71 </li>
72
73 <li> <a href="http://blogs.apache.org/hbase/" title="Blog">Blog</a>
74 </li>
75
76 <li> <a href="mail-lists.html" title="Mailing Lists">Mailing Lists</a>
77 </li>
78
79 <li> <a href="team-list.html" title="Team">Team</a>
80 </li>
81
82 <li> <a href="https://reviews.apache.org/" title="ReviewBoard">ReviewBoard</a>
83 </li>
84
85 <li> <a href="sponsors.html" title="Thanks">Thanks</a>
86 </li>
87
88 <li> <a href="poweredbyhbase.html" title="Powered by HBase">Powered by HBase</a>
89 </li>
90
91 <li> <a href="resources.html" title="Other resources">Other resources</a>
92 </li>
93 </ul>
94 </li>
95 <li class="dropdown">
96 <a href="#" class="dropdown-toggle" data-toggle="dropdown">Project Information <b class="caret"></b></a>
97 <ul class="dropdown-menu">
98
99 <li> <a href="project-summary.html" title="Project Summary">Project Summary</a>
100 </li>
101
102 <li> <a href="dependency-info.html" title="Dependency Information">Dependency Information</a>
103 </li>
104
105 <li> <a href="team-list.html" title="Team">Team</a>
106 </li>
107
108 <li> <a href="source-repository.html" title="Source Repository">Source Repository</a>
109 </li>
110
111 <li> <a href="issue-tracking.html" title="Issue Tracking">Issue Tracking</a>
112 </li>
113
114 <li> <a href="dependency-management.html" title="Dependency Management">Dependency Management</a>
115 </li>
116
117 <li> <a href="dependencies.html" title="Dependencies">Dependencies</a>
118 </li>
119
120 <li> <a href="dependency-convergence.html" title="Dependency Convergence">Dependency Convergence</a>
121 </li>
122
123 <li> <a href="integration.html" title="Continuous Integration">Continuous Integration</a>
124 </li>
125
126 <li> <a href="plugin-management.html" title="Plugin Management">Plugin Management</a>
127 </li>
128
129 <li> <a href="plugins.html" title="Plugins">Plugins</a>
130 </li>
131 </ul>
132 </li>
133 <li class="dropdown">
134 <a href="#" class="dropdown-toggle" data-toggle="dropdown">Documentation and API <b class="caret"></b></a>
135 <ul class="dropdown-menu">
136
137 <li> <a href="book.html" target="_blank" title="Reference Guide">Reference Guide</a>
138 </li>
139
140 <li> <a href="apache_hbase_reference_guide.pdf" target="_blank" title="Reference Guide (PDF)">Reference Guide (PDF)</a>
141 </li>
142
143 <li> <a href="book.html#quickstart" target="_blank" title="Getting Started">Getting Started</a>
144 </li>
145
146 <li> <a href="apidocs/index.html" target="_blank" title="User API">User API</a>
147 </li>
148
149 <li> <a href="testapidocs/index.html" target="_blank" title="User API (Test)">User API (Test)</a>
150 </li>
151
152 <li> <a href="devapidocs/index.html" target="_blank" title="Developer API">Developer API</a>
153 </li>
154
155 <li> <a href="testdevapidocs/index.html" target="_blank" title="Developer API (Test)">Developer API (Test)</a>
156 </li>
157
158 <li> <a href="http://abloz.com/hbase/book.html" target="_blank" title="中文参考指南(单页)">中文参考指南(单页)</a>
159 </li>
160
161 <li> <a href="book.html#faq" target="_blank" title="FAQ">FAQ</a>
162 </li>
163
164 <li> <a href="book.html#other.info" target="_blank" title="Videos/Presentations">Videos/Presentations</a>
165 </li>
166
167 <li> <a href="http://wiki.apache.org/hadoop/Hbase" target="_blank" title="Wiki">Wiki</a>
168 </li>
169
170 <li> <a href="acid-semantics.html" target="_blank" title="ACID Semantics">ACID Semantics</a>
171 </li>
172
173 <li> <a href="book.html#arch.bulk.load" target="_blank" title="Bulk Loads">Bulk Loads</a>
174 </li>
175
176 <li> <a href="metrics.html" target="_blank" title="Metrics">Metrics</a>
177 </li>
178
179 <li> <a href="cygwin.html" target="_blank" title="HBase on Windows">HBase on Windows</a>
180 </li>
181
182 <li> <a href="book.html#replication" target="_blank" title="Cluster replication">Cluster replication</a>
183 </li>
184
185 <li class="dropdown-submenu">
186 <a href="" title="1.2 Documentation">1.2 Documentation</a>
187 <ul class="dropdown-menu">
188 <li> <a href="1.2/apidocs/index.html" target="_blank" title="API">API</a>
189 </li>
190 <li> <a href="1.2/xref/index.html" target="_blank" title="X-Ref">X-Ref</a>
191 </li>
192 <li> <a href="1.2/book.html" target="_blank" title="Ref Guide (single-page)">Ref Guide (single-page)</a>
193 </li>
194 </ul>
195 </li>
196 </ul>
197 </li>
198 <li class="dropdown">
199 <a href="#" class="dropdown-toggle" data-toggle="dropdown">ASF <b class="caret"></b></a>
200 <ul class="dropdown-menu">
201
202 <li> <a href="http://www.apache.org/foundation/" target="_blank" title="Apache Software Foundation">Apache Software Foundation</a>
203 </li>
204
205 <li> <a href="http://www.apache.org/foundation/how-it-works.html" target="_blank" title="How Apache Works">How Apache Works</a>
206 </li>
207
208 <li> <a href="http://www.apache.org/foundation/sponsorship.html" target="_blank" title="Sponsoring Apache">Sponsoring Apache</a>
209 </li>
210 </ul>
211 </li>
212 </ul>
213
214 <div id="search-form" class="navbar-search pull-right">
215 <script type="text/javascript">
216 var cx = '000385458301414556862:sq1bb0xugjg';
217
218 (function() {
219 var gcse = document.createElement('script'); gcse.type = 'text/javascript'; gcse.async = true;
220 gcse.src = (document.location.protocol == 'https:' ? 'https:' : 'http:') + '//cse.google.com/cse.js?cx=' + cx;
221 var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(gcse, s);
222 })();
223
224 </script>
225 <gcse:search></gcse:search>
226 </div>
227
228
229
230 </div>
231
232 </div>
233 </div>
234 </div>
235
236 <div class="container">
237 <div id="banner">
238 <div class="pull-left">
239 <a href="./" id="bannerLeft">
240 <img src="" alt=""/>
241 </a>
242 </div>
243 <div class="pull-right"> <a href="./" id="bannerRight">
244 <img src="images/hbase_logo_with_orca_large.png" alt="Apache HBase"/>
245 </a>
246 </div>
247 <div class="clear"><hr/></div>
248 </div>
249
250 <div id="breadcrumbs">
251 <ul class="breadcrumb">
252
253
254
255
256
257
258 </ul>
259 </div>
260
261
262
263 <div id="bodyColumn" >
264
265 <!-- Licensed to the Apache Software Foundation (ASF) under one
266 or more contributor license agreements. See the NOTICE file
267 distributed with this work for additional information
268 regarding copyright ownership. The ASF licenses this file
269 to you under the Apache License, Version 2.0 (the
270 "License"); you may not use this file except in compliance
271 with the License. You may obtain a copy of the License at
272
273 http://www.apache.org/licenses/LICENSE-2.0
274
275 Unless required by applicable law or agreed to in writing,
276 software distributed under the License is distributed on an
277 "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
278 KIND, either express or implied. See the License for the
279 specific language governing permissions and limitations
280 under the License. -->
281
282 <div class="section">
283 <h2><a name="Powered_By_Apache_HBase"></a>Powered By Apache HBase&#x2122;</h2>
284
285 <p>This page lists some institutions and projects which are using HBase. To
286 have your organization added, file a documentation JIRA or email
287 <a class="externalLink" href="mailto:dev@hbase.apache.org">hbase-dev</a> with the relevant
288 information. If you notice out-of-date information, use the same avenues to
289 report it.
290 </p>
291
292 <p><b>These items are user-submitted and the HBase team assumes no responsibility for their accuracy.</b></p>
293
294 <dl>
295
296 <dt><a class="externalLink" href="http://www.adobe.com">Adobe</a></dt>
297
298 <dd>We currently have about 30 nodes running HDFS, Hadoop and HBase in clusters
299 ranging from 5 to 14 nodes on both production and development. We plan a
300 deployment on an 80-node cluster. We are using HBase in several areas from
301 social services to structured data and processing for internal use. We constantly
302 write data to HBase and run MapReduce jobs to process it, then store it back to
303 HBase or external systems. Our production cluster has been running since Oct 2008.</dd>
304
305
306 <dt><a class="externalLink" href="http://spark-packages.org/package/Huawei-Spark/Spark-SQL-on-HBase">Project Astro</a></dt>
307
308 <dd>
309 Astro provides fast Spark SQL/DataFrame capabilities to HBase data,
310 featuring super-efficient access to multi-dimensional HBase rows through
311 native Spark execution in HBase coprocessor plus systematic and accurate
312 partition pruning and predicate pushdown from arbitrarily complex data
313 filtering logic. The batch load is optimized to run on the Spark execution
314 engine. Note that <a class="externalLink" href="http://spark-packages.org/package/Huawei-Spark/Spark-SQL-on-HBase">Spark-SQL-on-HBase</a>
315 is the release site. Interested parties are free to make clones and claim
316 to be &quot;latest (and active)&quot;, but they are not endorsed by the owner.
317 </dd>
318
319
320 <dt><a class="externalLink" href="http://axibase.com/products/axibase-time-series-database/">Axibase
321 Time Series Database (ATSD)</a></dt>
322
323 <dd>ATSD runs on top of HBase to collect, analyze and visualize time series
324 data at scale. ATSD capabilities include optimized storage schema, built-in
325 rule engine, forecasting algorithms (Holt-Winters and ARIMA) and next-generation
326 graphics designed for high-frequency data. Primary use cases: IT infrastructure
327 monitoring, data consolidation, operational historian in OPC environments.</dd>
328
329
330 <dt><a class="externalLink" href="http://www.benipaltechnologies.com">Benipal Technologies</a></dt>
331
332 <dd>We have a 35-node cluster used for HBase and MapReduce with Lucene / SOLR
333 and Katta integration to create and fine-tune our search databases. Currently,
334 our HBase installation has over 10 billion rows with 100s of datapoints per row.
335 We compute over 10<sup>18</sup> calculations daily using MapReduce directly on HBase. We
336 heart HBase.</dd>
337
338
339 <dt><a class="externalLink" href="https://github.com/ermanpattuk/BigSecret">BigSecret</a></dt>
340
341 <dd>BigSecret is a security framework that is designed to secure Key-Value data,
342 while preserving efficient processing capabilities. It achieves cell-level
343 security, using combinations of different cryptographic techniques, in an
344 efficient and secure manner. It provides a wrapper library around HBase.</dd>
345
346
347 <dt><a class="externalLink" href="http://caree.rs">Caree.rs</a></dt>
348
349 <dd>Accelerated hiring platform for HiTech companies. We use HBase and Hadoop
350 for all aspects of our backend - job and company data storage, analytics
351 processing, machine learning algorithms for our hire recommendation engine.
352 Our live production site is directly served from HBase. We use cascading for
353 running offline data processing jobs.</dd>
354
355
356 <dt><a class="externalLink" href="http://www.celer-tech.com/">Celer Technologies</a></dt>
357
358 <dd>Celer Technologies is a global financial software company that creates
359 modular-based systems that have the flexibility to meet tomorrow's business
360 environment, today. The Celer framework uses Hadoop/HBase for storing all
361 financial data for trading, risk, clearing in a single data store. With our
362 flexible framework and all the data in Hadoop/HBase, clients can build new
363 features to quickly extract data based on their trading, risk and clearing
364 activities from one single location.</dd>
365
366
367 <dt><a class="externalLink" href="http://www.explorys.net">Explorys</a></dt>
368
369 <dd>Explorys uses an HBase cluster containing over a billion anonymized clinical
370 records, to enable subscribers to search and analyze patient populations,
371 treatment protocols, and clinical outcomes.</dd>
372
373
374 <dt><a class="externalLink" href="http://www.facebook.com/notes/facebook-engineering/the-underlying-technology-of-messages/454991608919">Facebook</a></dt>
375
376 <dd>Facebook uses HBase to power their Messages infrastructure.</dd>
377
378
379 <dt><a class="externalLink" href="http://www.filmweb.pl">Filmweb</a></dt>
380
381 <dd>Filmweb is a film web portal with a large dataset of films, persons and
382 movie-related entities. We have just started a small cluster of 3 HBase nodes
383 to handle our web cache persistency layer. We plan to increase the cluster
384 size, and also to start migrating some of the data from our databases which
385 have some demanding scalability requirements.</dd>
386
387
388 <dt><a class="externalLink" href="http://www.flurry.com">Flurry</a></dt>
389
390 <dd>Flurry provides mobile application analytics. We use HBase and Hadoop for
391 all of our analytics processing, and serve all of our live requests directly
392 out of HBase on our 50 node production cluster with tens of billions of rows
393 over several tables.</dd>
394
395
396 <dt><a class="externalLink" href="http://gumgum.com">GumGum</a></dt>
397
398 <dd>GumGum is an In-Image Advertising Platform. We use HBase on a 15-node
399 Amazon EC2 High-CPU Extra Large (c1.xlarge) cluster for both real-time data
400 and analytics. Our production cluster has been running since June 2010.</dd>
401
402
403 <dt><a class="externalLink" href="http://helprace.com/help-desk/">Helprace</a></dt>
404
405 <dd>Helprace is a customer service platform which uses Hadoop for analytics
406 and internal searching and filtering. Being on HBase we can share our HBase
407 and Hadoop cluster with other Hadoop processes - this particularly helps in
408 keeping community speeds up. We use Hadoop and HBase on a small cluster of nodes with 4
409 cores and 32 GB RAM each.</dd>
410
411
412 <dt><a class="externalLink" href="http://hubspot.com">HubSpot</a></dt>
413
414 <dd>HubSpot is an online marketing platform, providing analytics, email, and
415 segmentation of leads/contacts. HBase is our primary datastore for our customers'
416 customer data, with multiple HBase clusters powering the majority of our
417 product. We have nearly 200 regionservers across the various clusters, and
418 2 Hadoop clusters also with nearly 200 tasktrackers. We use c1.xlarge in EC2
419 for both, but are starting to move some of that to bare-metal hardware. We've
420 been running HBase for over 2 years.</dd>
421
422
423 <dt><a class="externalLink" href="http://www.infolinks.com/">Infolinks</a></dt>
424
425 <dd>Infolinks is an In-Text ad provider. We use HBase to process advertisement
426 selection and user events for our In-Text ad network. The reports generated
427 from HBase are used as feedback for our production system to optimize ad
428 selection.</dd>
429
430
431 <dt><a class="externalLink" href="http://www.kalooga.com">Kalooga</a></dt>
432
433 <dd>Kalooga is a discovery service for image galleries. We use Hadoop, HBase
434 and Pig on a 20-node cluster for our crawling, analysis and events
435 processing.</dd>
436
437
438 <dt><a class="externalLink" href="http://www.leanxcale.com/">LeanXcale</a></dt>
439
440 <dd>LeanXcale provides an ultra-scalable transactional &amp; SQL database that
441 stores its data on HBase and it is able to scale to 1000s of nodes. It
442 also provides a standalone full ACID HBase with transactions across
443 arbitrary sets of rows and tables.</dd>
444
445
446
447 <dt><a class="externalLink" href="http://www.mahalo.com">Mahalo</a></dt>
448
449 <dd>Mahalo, &quot;...the world's first human-powered search engine&quot;. All the markup
450 that powers the wiki is stored in HBase. It's been in use for a few months now.
451 MediaWiki - the same software that powers Wikipedia - has version/revision control.
452 Mahalo's in-house editors produce a lot of revisions per day, which was not
453 working well in an RDBMS. An HBase-based solution for this was built and tested,
454 and the data migrated out of MySQL and into HBase. Right now it's at something
455 like 6 million items in HBase. The upload tool runs every hour from a shell
456 script to back up that data, and on 6 nodes takes about 5-10 minutes to run -
457 and does not slow down production at all.</dd>
458
459
460 <dt><a class="externalLink" href="http://www.meetup.com">Meetup</a></dt>
461
462 <dd>Meetup is on a mission to help the world&#x2019;s people self-organize into local
463 groups. We use Hadoop and HBase to power a site-wide, real-time activity
464 feed system for all of our members and groups. Group activity is written
465 directly to HBase, and indexed per member, with the member's custom feed
466 served directly from HBase for incoming requests. We're running HBase
467 0.20.0 on an 11-node cluster.</dd>
468
469
470 <dt><a class="externalLink" href="http://www.mendeley.com">Mendeley</a></dt>
471
472 <dd>Mendeley is creating a platform for researchers to collaborate and share
473 their research online. HBase is helping us to create the world's largest
474 research paper collection and is being used to store all our raw imported data.
475 We use a lot of MapReduce jobs to process these papers into pages displayed
476 on the site. We also use HBase with Pig to do analytics and produce the article
477 statistics shown on the web site. You can find out more about how we use HBase
478 in the <a class="externalLink" href="http://www.slideshare.net/danharvey/hbase-at-mendeley">HBase
479 At Mendeley</a> slide presentation.</dd>
480
481
482 <dt><a class="externalLink" href="http://www.ngdata.com">NGDATA</a></dt>
483
484 <dd>NGDATA delivers <a class="externalLink" href="http://www.ngdata.com/site/products/lily.html">Lily</a>,
485 the consumer intelligence solution that delivers a unique combination of Big
486 Data management, machine learning technologies and consumer intelligence
487 applications in one integrated solution to allow better, and more dynamic,
488 consumer insights. Lily allows companies to process and analyze massive structured
489 and unstructured data, scale storage elastically and locate actionable data
490 quickly from large data sources in near real time.</dd>
491
492
493 <dt><a class="externalLink" href="http://ning.com">Ning</a></dt>
494
495 <dd>Ning uses HBase to store and serve the results of processing user events
496 and log files, which allows us to provide near-real time analytics and
497 reporting. We use a small cluster of commodity machines with 4 cores and 16GB
498 of RAM per machine to handle all our analytics and reporting needs.</dd>
499
500
501 <dt><a class="externalLink" href="http://www.worldcat.org">OCLC</a></dt>
502
503 <dd>OCLC uses HBase as the main data store for WorldCat, a union catalog which
504 aggregates the collections of 72,000 libraries in 112 countries and territories.
505 WorldCat is currently comprised of nearly 1 billion records with nearly 2
506 billion library ownership indications. We're running a 50 Node HBase cluster
507 and a separate offline map-reduce cluster.</dd>
508
509
510 <dt><a class="externalLink" href="http://olex.openlogic.com">OpenLogic</a></dt>
511
512 <dd>OpenLogic stores all the world's Open Source packages, versions, files,
513 and lines of code in HBase for both near-real-time access and analytical
514 purposes. The production cluster has well over 100TB of disk spread across
515 nodes with 32GB+ RAM and dual-quad or dual-hex core CPUs.</dd>
516
517
518 <dt><a class="externalLink" href="http://www.openplaces.org">Openplaces</a></dt>
519
520 <dd>Openplaces is a search engine for travel that uses HBase to store terabytes
521 of web pages and travel-related entity records (countries, cities, hotels,
522 etc.). We have dozens of MapReduce jobs that crunch data on a daily basis.
523 We use a 20-node cluster for development, a 40-node cluster for offline
524 production processing and an EC2 cluster for the live web site.</dd>
525
526
527 <dt><a class="externalLink" href="http://www.pnl.gov">Pacific Northwest National Laboratory</a></dt>
528
529 <dd>Hadoop and HBase (Cloudera distribution) are being used within PNNL's
530 Computational Biology &amp; Bioinformatics Group for a systems biology data
531 warehouse project that integrates high throughput proteomics and transcriptomics
532 data sets coming from instruments in the Environmental Molecular Sciences
533 Laboratory, a US Department of Energy national user facility located at PNNL.
534 The data sets are being merged and annotated with other public genomics
535 information in the data warehouse environment, with Hadoop analysis programs
536 operating on the annotated data in the HBase tables. This work is hosted by
537 <a class="externalLink" href="http://www.pnl.gov/news/release.aspx?id=908">olympus</a>, a large PNNL
538 institutional computing cluster, with the HBase tables being stored in olympus's
539 Lustre file system.</dd>
540
541
542 <dt><a class="externalLink" href="http://www.readpath.com/">ReadPath</a></dt>
543
544 <dd>ReadPath uses HBase to store several hundred million RSS items and dictionary
545 for its RSS newsreader. ReadPath is currently running on an 8-node cluster.</dd>
546
547
548 <dt><a class="externalLink" href="http://resu.me/">resu.me</a></dt>
549
550 <dd>Career network for the net generation. We use HBase and Hadoop for all
551 aspects of our backend - user and resume data storage, analytics processing,
552 machine learning algorithms for our job recommendation engine. Our live
553 production site is directly served from HBase. We use cascading for running
554 offline data processing jobs.</dd>
555
556
557 <dt><a class="externalLink" href="http://www.runa.com/">Runa Inc.</a></dt>
558
559 <dd>Runa Inc. offers a SaaS that enables online merchants to offer dynamic
560 per-consumer, per-product promotions embedded in their website. To implement
561 this we collect the click streams of all their visitors to determine along
562 with the rules of the merchant what promotion to offer the visitor at different
563 points of their browsing the Merchant website. So we have lots of data and have
564 to do lots of off-line and real-time analytics. HBase is the core for us.
565 We also use Clojure and our own open sourced distributed processing framework,
566 Swarmiji. The HBase Community has been key to our forward movement with HBase.
567 We're looking for experienced developers to join us to help make things go even
568 faster!</dd>
569
570
571 <dt><a class="externalLink" href="http://www.sematext.com/">Sematext</a></dt>
572
573 <dd>Sematext runs
574 <a class="externalLink" href="http://www.sematext.com/search-analytics/index.html">Search Analytics</a>,
575 a service that uses HBase to store search activity and MapReduce to produce
576 reports showing user search behaviour and experience. Sematext runs
577 <a class="externalLink" href="http://www.sematext.com/spm/index.html">Scalable Performance Monitoring (SPM)</a>,
578 a service that uses HBase to store performance data over time, crunch it with
579 the help of MapReduce, and display it in a visually rich browser-based UI.
580 Interestingly, SPM features
581 <a class="externalLink" href="http://www.sematext.com/spm/hbase-performance-monitoring/index.html">SPM for HBase</a>,
582 which is specifically designed to monitor all HBase performance metrics.</dd>
583
584
585 <dt><a class="externalLink" href="http://www.socialmedia.com/">SocialMedia</a></dt>
586
587 <dd>SocialMedia uses HBase to store and process user events which allows us to
588 provide near-realtime user metrics and reporting. HBase forms the heart of
589 our Advertising Network data storage and management system. We use HBase as
590 a data source and sink for both realtime request cycle queries and as a
591 backend for mapreduce analysis.</dd>
592
593
594 <dt><a class="externalLink" href="http://www.splicemachine.com/">Splice Machine</a></dt>
595
596 <dd>Splice Machine is built on top of HBase. Splice Machine is a full-featured
597 ANSI SQL database that provides real-time updates, secondary indices, ACID
598 transactions, optimized joins, triggers, and UDFs.</dd>
599
600
601 <dt><a class="externalLink" href="http://www.streamy.com/">Streamy</a></dt>
602
603 <dd>Streamy is a recently launched realtime social news site. We use HBase
604 for all of our data storage, query, and analysis needs, replacing an existing
605 SQL-based system. This includes hundreds of millions of documents, sparse
606 matrices, logs, and everything else once done in the relational system. We
607 perform significant in-memory caching of query results similar to a traditional
608 Memcached/SQL setup as well as other external components to perform joining
609 and sorting. We also run thousands of daily MapReduce jobs using HBase tables
610 for log analysis, attention data processing, and feed crawling. HBase has
611 helped us scale and distribute in ways we could not otherwise, and the
612 community has provided consistent and invaluable assistance.</dd>
613
614
615 <dt><a class="externalLink" href="http://www.stumbleupon.com/">Stumbleupon</a></dt>
616
617 <dd>Stumbleupon and <a class="externalLink" href="http://su.pr">Su.pr</a> use HBase as a real time
618 data storage and analytics platform. Serving directly out of HBase, various site
619 features and statistics are kept up to date in a real time fashion. We also
620 use HBase as a MapReduce data source to overcome traditional query speed limits
621 in MySQL.</dd>
622
623
624 <dt><a class="externalLink" href="http://www.tokenizer.org">Shopping Engine at Tokenizer</a></dt>
625
626 <dd>Shopping Engine at Tokenizer is a web crawler; it uses HBase to store URLs
627 and Outlinks (AnchorText + LinkedURL): more than a billion. It was initially
628 designed as a Nutch-Hadoop extension, then (due to a very specific 'shopping'
629 scenario) moved to SOLR + MySQL(InnoDB) (ten thousands queries per second),
630 and now - to HBase. HBase is significantly faster due to: no need for huge
631 transaction logs, column-oriented design exactly matches 'lazy' business logic,
632 data compression, MapReduce support. The number of mutable 'indexes' (a term from
633 RDBMS) is significantly reduced due to the fact that each 'row::column' structure
634 is physically sorted by 'row'. The MySQL InnoDB engine is the best DB choice for
635 highly-concurrent updates. However, the necessity to flush a block of data to
636 the hard drive even if we changed only a few bytes is an obvious bottleneck. HBase
637 greatly helps: not-so-popular in modern DBMS 'delete-insert', 'mutable primary
638 key', and 'natural primary key' patterns become a big advantage with HBase.</dd>
639
640
641 <dt><a class="externalLink" href="http://traackr.com/">Traackr</a></dt>
642
643 <dd>Traackr uses HBase to store and serve online influencer data in real-time.
644 We use MapReduce to frequently re-score our entire data set as we keep updating
645 influencer metrics on a daily basis.</dd>
646
647
648 <dt><a class="externalLink" href="http://trendmicro.com/">Trend Micro</a></dt>
649
650 <dd>Trend Micro uses HBase as a foundation for cloud scale storage for a variety
651 of applications. We have been developing with HBase since version 0.1 and
652 production since version 0.20.0.</dd>
653
654
655 <dt><a class="externalLink" href="http://www.twitter.com">Twitter</a></dt>
656
657 <dd>Twitter runs HBase across its entire Hadoop cluster. HBase provides a
658 distributed, read/write backup of all MySQL tables in Twitter's production
659 backend, allowing engineers to run MapReduce jobs over the data while maintaining
660 the ability to apply periodic row updates (something that is more difficult
661 to do with vanilla HDFS). A number of applications including people search
662 rely on HBase internally for data generation. Additionally, the operations
663 team uses HBase as a timeseries database for cluster-wide monitoring/performance
664 data.</dd>
665
666
667 <dt><a class="externalLink" href="http://www.udanax.org">Udanax.org</a></dt>
668
669 <dd>Udanax.org is a URL shortener which uses a 10-node HBase cluster to store URLs
670 and Web Log data and to respond to real-time requests on its Web Server. This
671 application is now used for some Twitter clients and a number of web sites.
672 Currently API requests are almost 30 per second and web redirection requests
673 are about 300 per second.</dd>
674
675
676 <dt><a class="externalLink" href="http://www.veoh.com/">Veoh Networks</a></dt>
677
678 <dd>Veoh Networks uses HBase to store and process visitor (human) and entity
679 (non-human) profiles which are used for behavioral targeting, demographic
680 detection, and personalization services. Our site reads this data in
681 real-time (heavily cached) and submits updates via various batch map/reduce
682 jobs. With 25 million unique visitors a month storing this data in a traditional
683 RDBMS is not an option. We currently have a 24 node Hadoop/HBase cluster and
684 our profiling system is sharing this cluster with our other Hadoop data
685 pipeline processes.</dd>
686
687
688 <dt><a class="externalLink" href="http://www.videosurf.com/">VideoSurf</a></dt>
689
690 <dd>VideoSurf - &quot;The video search engine that has taught computers to see&quot;.
691 We're using HBase to persist various large graphs of data and other statistics.
692 HBase was a real win for us because it let us store substantially larger
693 datasets without the need for manually partitioning the data and its
694 column-oriented nature allowed us to create schemas that were substantially
695 more efficient for storing and retrieving data.</dd>
696
697
698 <dt><a class="externalLink" href="http://www.visibletechnologies.com/">Visible Technologies</a></dt>
699
700 <dd>Visible Technologies uses Hadoop, HBase, Katta, and more to collect, parse,
701 store, and search hundreds of millions of pieces of social media content. We get incredibly
702 fast throughput and very low latency on commodity hardware. HBase enables our
703 business to exist.</dd>
704
705
706 <dt><a class="externalLink" href="http://www.worldlingo.com/">WorldLingo</a></dt>
707
708 <dd>The WorldLingo Multilingual Archive. We use HBase to store millions of
709 documents that we scan using Map/Reduce jobs to machine translate them into
710 all or selected target languages from our set of available machine translation
711 languages. We currently store 12 million documents but plan to eventually
712 reach the 450 million mark. HBase allows us to scale out as we need to grow
713 our storage capacities. Combined with Hadoop to keep the data replicated and
714 therefore fail-safe we have the backbone our service can rely on now and in
715 the future. WorldLingo has been using HBase since December 2007 and is, along with
716 a few others, one of the longest-running HBase installations. Currently we are
717 running the latest HBase 0.20 and serving directly from it at
718 <a class="externalLink" href="http://www.worldlingo.com/ma/enwiki/en/HBase">MultilingualArchive</a>.</dd>
719
720
721 <dt><a class="externalLink" href="http://www.yahoo.com/">Yahoo!</a></dt>
722
723 <dd>Yahoo! uses HBase to store document fingerprints for detecting near-duplicates.
724 We have a cluster of a few nodes that runs HDFS, MapReduce, and HBase. The table
725 contains millions of rows. We use this for querying duplicated documents with
726 realtime traffic.</dd>
727
728
729 <dt><a class="externalLink" href="http://h50146.www5.hp.com/products/software/security/icewall/eng/">HP IceWall SSO</a></dt>
730
731 <dd>HP IceWall SSO is a web-based single sign-on solution and uses HBase to store
732 user data to authenticate users. We have supported RDB and LDAP previously but
733 have newly added support for HBase with a view to authenticating tens of millions
734 of users and devices.</dd>
735
736
737 <dt><a class="externalLink" href="http://www.ymc.ch/en/big-data-analytics-en?utm_source=hadoopwiki&amp;utm_medium=poweredbypage&amp;utm_campaign=ymc.ch">YMC AG</a></dt>
738
739 <dd>
740 <ul>
741
742 <li>operating a Cloudera Hadoop/HBase cluster for media monitoring purposes</li>
743
744 <li>offering technical and operative consulting for the Hadoop stack + ecosystem</li>
745
746 <li>editor of <a class="externalLink" href="http://www.ymc.ch/en/hbase-split-visualisation-introducing-hannibal?utm_source=hadoopwiki&amp;utm_medium=poweredbypage&amp;utm_campaign=ymc.ch">Hannibal</a>, an open-source tool
747 to visualize HBase region sizes and splits that helps with running HBase in production</li>
748 </ul></dd>
749 </dl>
750 </div>
751
752
753 </div>
754 </div>
755
756 <hr/>
757
758 <footer>
759 <div class="container">
760 <div class="row">
761 <p >Copyright &copy; 2007&#x2013;2018
762 <a href="https://www.apache.org/">The Apache Software Foundation</a>.
763 All rights reserved.
764
765 <li id="publishDate" class="pull-right">Last Published: 2018-03-11</li>
766 </p>
767 </div>
768
769 <p id="poweredBy" class="pull-right">
770 <a href="http://maven.apache.org/" title="Built by Maven" class="poweredBy">
771 <img class="builtBy" alt="Built by Maven" src="./images/logos/maven-feather.png" />
772 </a>
773 </p>
774
775 </div>
776 </footer>
777 </body>
778 </html>