Bu Y., Howe B., Balazinska M., Ernst M.D. 2010. HaLoop: efficient iterative data processing on large clusters. Proceedings of the VLDB Endowment 3(1).
4.
Das S., Sismanis Y., Beyer K.S., Gemulla R., Haas P.J., McPherson J. 2010. Ricardo: integrating R and Hadoop. SIGMOD'10: Proceedings of the ACM SIGMOD International Conference on Management of Data, 987–998.
5.
Dean J., Ghemawat S. 2004. MapReduce: simplified data processing on large clusters. Proceedings of the 6th USENIX Symposium on Operating Systems Design & Implementation (OSDI).
6.
Isard M., Budiu M., Yu Y., Birrell A., Fetterly D. 2007. Dryad: distributed data-parallel programs from sequential building blocks. Proceedings of the European Conference on Computer Systems (EuroSys), 59–72.
7.
Khoussainova N., Kwon Y., Balazinska M., Suciu D. 2011. SnipSuggest: a context-aware SQL autocomplete system. Proceedings of the VLDB Endowment 4(1).
8.
Ko S.Y., Hoque I., Cho B., Gupta I. 2010. Making cloud intermediate data fault-tolerant. ACM Symposium on Cloud Computing (SOCC), 181–192.
9.
Kwon Y., Balazinska M., Howe B., Rolia J. 2010a. Skew-resistant parallel processing of feature-extracting scientific user-defined functions. ACM Symposium on Cloud Computing (SOCC).
10.
Kwon Y., Nunley D., Gardner J.P., Balazinska M., Howe B., Loebman S. 2010b. Scalable clustering algorithm for N-body simulations in a shared-nothing cluster. 22nd International Conference on Scientific and Statistical Database Management (SSDBM).
11.
Loebman S., Nunley D., Kwon Y., Howe B., Balazinska M., Gardner J.P. 2009. Analyzing massive astrophysical datasets: Can Pig/Hadoop or a relational DBMS help? Proceedings of the Workshop on Interfaces and Architectures for Scientific Data Storage.
12.
Logothetis D., Olston C., Reed B., Webb K.C., Yocum K. 2010. Stateful bulk processing for incremental analytics. ACM Symposium on Cloud Computing (SOCC), 51–62.
13.
Lu W., Jackson J., Barga R. 2010. AzureBlast: a case study of developing science applications on the cloud. Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010).
14.
Morton K., Balazinska M., Grossman D. 2010a. ParaTimer: a progress indicator for MapReduce DAGs. SIGMOD'10: Proceedings of the ACM SIGMOD International Conference on Management of Data.
15.
Morton K., Friesen A., Balazinska M., Grossman D. 2010b. Estimating the progress of MapReduce pipelines. Proceedings of the 26th International Conference on Data Engineering (ICDE).
16.
Olston C., Reed B., Srivastava U., Kumar R., Tomkins A. 2008. Pig Latin: a not-so-foreign language for data processing. SIGMOD'08: Proceedings of the ACM SIGMOD International Conference on Management of Data, 1099–1110.
17.
Pavlo A., Paulson E., Rasin A., Abadi D.J., DeWitt D.J., Madden S. et al. 2009. A comparison of approaches to large-scale data analysis. SIGMOD'09: Proceedings of the ACM SIGMOD International Conference on Management of Data, 165–178.
18.
Pruscino A. 2003. Oracle RAC: architecture and performance. SIGMOD'03: Proceedings of the ACM SIGMOD International Conference on Management of Data, 635.
19.
Rogers J., Simakov R., Soroush E., Velikhov P., Balazinska M., DeWitt D. et al. 2010. Overview of SciDB: large scale array storage, processing and analysis. SIGMOD'10: Proceedings of the ACM SIGMOD International Conference on Management of Data.
20.
Schatz M.C. 2009. CloudBurst: highly sensitive read mapping with MapReduce. Bioinformatics, 25:1363–1369.
21.
Stonebraker M., Becla J., DeWitt D., Lim K.-T., Maier D., Ratzesberger O. et al. 2009. Requirements for science data bases and SciDB. Fourth Biennial Conference on Innovative Data Systems Research (CIDR), Perspectives.
22.
Watson P. 2010. Cloud computing for chemical property prediction. Talk at Cloud Futures.
23.
Xu Y., Kostamaa P., Gao L. 2010. Integrating Hadoop and parallel DBMS. SIGMOD'10: Proceedings of the ACM SIGMOD International Conference on Management of Data, 969–974.