
Data Management Tools for Scientific Analytics
