• Home
    • Phil-Wilkins.uk
  • About
    • Presenting Activities
    • http://phil-wilkins.uk/
    • LinkedIn
  • Books & Publications
    • Logging in Action with Fluentd, Kubernetes and More
      • Logging in Action with Fluentd – Book
      • Fluentd Book Resources
      • Log Generator
    • API & API Platform
      • API Useful Resources
    • Oracle Integration
      • Book Website
      • Useful Reading Sources
  • Resources
    • GitHub
    • Mindmaps Index
    • Patterns Sources
    • Oracle Integration Site

Phil (aka MP3Monster)'s Blog

~ from Technology to Music

Phil (aka MP3Monster)'s Blog

Daily Archives: April 22, 2014

Oracle Big Data Handbook – part 2 reviewed

22 Tuesday Apr 2014

Posted by mp3monster in Book Reviews, General, Oracle, Oracle Press, Technology

≈ 2 Comments

Tags

adaptors, BDA, BerkeleyDB, Big Data, book, connectors, Hadoop, NoSQL, ODI, Oracle, Oracle Press, review, Sleepycat, sqoop

After an excellent start in Part 1 of Oracle Press’ Oracle Big Data Handbook (reviewed here). Part 2 moves on to looking at Apache Hadoop, Oracle’s Big Data Appliance and Oracle’s NoSQL offerings.

So chapter 3 provides a brilliant overview of Hadoop and the echo system that has been developed around it. Addressing the divergent versions of Map Reduce leading to the likes of YARN. Touching on how commericalised versions of Hadoop have been taken forward with this (such as Cloudera).

 

Apache Hadoop echo system

Moving onto to describe the core solution components such as Node Managers and the relationship to hardware and the use of more commodity kit rather than using nice expensive SAN technology.

Hadoop Structure

So now we have good (pretty much uncoloured by Oracle) view of Hadoop. Which leads into the the next chapter (chapter 4) which looks at why Oracle have taken the approach of an Appliance (which could be seen as contrary to the previous stated adoption of commodity kit).

Oracle Big Data

So as you can see Oracle woven together a set of technologies into an Exadata based platform which would not only deal with Big Data Analytics but ideally support other volume scenario needs so you’re not adding another data silo. all of which fits with Oracle’s Engineered Solutions view point. The book takes on a explains the other factors involved in the BDA design – those of commercial considerations and value propositions in relation to its customer base – very refreshing to see (rather than rationalisation through technical arguments alone).

The book addresses the challenge of why should I go to Oracle for big data? Which is well argued on the experience of very large relational deployments. Oracle’s contributions to Hadoop via Cloudera and so on. The chapter finishes with the argument around cost comparison between buying a comparable hardware solution to build your own cluster. Taking just list prices compared to HP and the hardware costs come in more or less the same, that’s before you account for the fact the Oracle price includes all the software.

 

Chapter 5 addresses the deployment of the BDA, explaining the configuration process, which with the combination of a tool called Mammoth (appropriate really) and the lies of Puppet seems pretty simple as a lot of the solution is preconfigured on the box ready. all of which is reasonably well explained. my only grumble is that we do seem to revisit the details of the hardware fairly regularly as the details are again presented here, although we go into a deeper dive in the configuration. One surprise that I’d not picked up on is that Oracle have made their NoSQL solution available as open source, although a little digging might contribute to why as it has links back to Sleepycat’s BerkeleyDB that Oracle acquired (more here). As the chapter move through the physical aspects of the deployment it also highlights in clear terms any constraints Oracle imposes to ensure that the whole appliance is supportable, the most significant of these areas is the advanced networking that is setup.

Chapter 5 as it moves through deployment considerations addresses the means to know that the appliance is running properly – so we’re talking about system monitoring not just of the hardware but the distributed nature of Hadoop and Map Reduce. So a brief view of the products deployed is given. Obviously this centres on the Enterprise Manager extensions, but also the component level tooling such as Cloudera’s Hadoop Manager.

Chapter 6 in many respects continues building out the view of Hadoop to describe briefly the analytics tooling both in the Oracle RDBMS, R language and data mining/discovery of Endeca. The interesting points in the chapter are about the relationship with RDBMS particularly as an enterprise data warehouse – something I’ve not seen really addressed elsewhere as the common world view seems to put Hadoop in the same camp as NoSQL which seems to be gaining the zeal and polarity that Linux vs Windows used to have when it comes to RDBMS. But I think the book makes a good case for right tool for the right job.

Oracle’s Strategic Product View

Chapter 7 starts to drill in to how the connector package offers which consider Oracle database data transfer, combining the R language with MapReduce and ODI.

The database connector aim to provide efficiency in transferring data between Hadoop and the Oracle RDBMS over say using Sqoop to transfer data to and from an Oracle database (ODI connectors, JDBC, direct OCI etc). To fully understand the explanation of how this works you do need to understand the basics of MapReduce although as the chapter progresses the relevant MapReduce operations are elaborated upon. As the chapter progresses we start being shown configuration fragments for the different connection approaches.

The final chapter of this section of the book looks at the NoSQL database in detail, starting with high level ideas such as how NoSQL relates to ACID and BASE ideas, dropping down into significant (but valuable) detail by describing how clients are kept in sync through the use of separate threads picking up data about the data partitioning (sharding).  Once the key components have been well described the chapter moves onto explain how Oracle has optimized the process to make the NoSQL as performant as is possible whilst providing a solution that is elastic in nature and highly resilient but still predictable in its dynamics.

The chapter finishes off with considerations such as installation, how it integrates with Hadoop and OBIEE.

Overall, this a very informative chapter, occasionally it feels like some of the information is being repeated but in a different structure but it isn’t the end of the world, although if you’re reading from cover to cover you need to just press on.

 

Part 1 of the book is reviewed here.

Oracle Ace Director

TOGAF 9

Logging in Action

Oracle Cloud Integration Book

API Platform Book

Oracle Dev Meetup London

Categories

  • App Ideas
  • Books
    • Book Reviews
    • manning
    • Oracle Press
    • Packt
  • Enterprise architecture
  • General
    • economy
    • LinkedIn
    • Website
  • Music
    • Music Resources
    • Music Reviews
  • Photography
  • Technology
    • APIs & microservices
    • chatbots
    • Cloud
    • Dev Meetup
    • development
    • drone
    • FluentD
    • mindmap
    • OMESA
    • Oracle
      • API Platform CS
        • tools
      • Helidon
      • ITSO & OEAF
      • Java Cloud
      • NodeJS Cloud
      • OIC – ICS
    • TOGAF
    • UKOUG
  • xxRetired

Twitter

  • RT @WunderlichRd: Super excited to announce that API Gateway now provides response caching! #cloud #cloudnative #apigateway https://t.co/2…Next Tweet: 1 day ago
  • RT @conf42com: 🔴 ❝Making Logs Work for you with Fluentd❞ Register: conf42.com/Cloud_Native_2… by @mp3monster, Senior Consultant and #TechEvan…Next Tweet: 6 days ago
  • RT @HamillHimself: Smart dog. I only wish I'd thought of hiding behind the couch when I first met him.Next Tweet: 1 week ago
  • RT @WunderlichRd: When providing APIs, offering an SDK makes life much easier on your users. Oracle Cloud Infrastructure API Gateway now s…Next Tweet: 1 week ago
  • RT @HeliFromFinland: Thank you @javedmohammed @OracleSysDev for all the great interviews, your hard work, and your continuos support for th…Next Tweet: 1 week ago
Follow @mp3monster

OraWorld

OraWorld

Enter your email address to subscribe to this blog and receive notifications of new posts by email.

Join 578 other followers

Blogs I Follow

  • Rick's blog
  • A journey in development
  • Phil (aka MP3Monster)'s Blog
  • RedThunder.Blog
  • A millennial's musings
  • Shalindra's Blogs
  • BTplusMore
  • Creativenauts
  • PaaS Community Blog
  • RedStack
  • Musings of an Enterprise Software Technologist
  • The Open Group Blog
  • SutoCom Solutions
  • Rob's Wall Of Music
  • DataCentricSec.com
  • A World of Events

My Other Web Content & Contributions

  • All My Links
  • Amazon Author entry
  • API Platform
  • Dev Meetup (co-managed)
  • Fluentd Book
  • http://phil-wilkins.uk/
  • ICS Book Website
  • Mindmaps
  • Monster's Photos
  • my Capgemini Profile
  • OMESA
  • Oracle Community Directory
  • Packt Author Bio

RSS

RSS Feed RSS - Posts

RSS Feed RSS - Comments

Calendar

April 2014
M T W T F S S
 123456
78910111213
14151617181920
21222324252627
282930  
« Mar   May »

Other Pages

  • About
    • Presenting Activities
  • Books & Publications
    • API & API Platform
      • API Useful Resources
      • Useful Reading Sources
    • Logging in Action with Fluentd, Kubernetes and More
    • Oracle Integration
  • Mindmaps Index
    • Patterns Sources

Speaker Recognition

Open Source Summit Speaker

Flickr Pics

UKOUG volunteersBrightonBrightonBrighton
More Photos

History

Goodreads

OraNA

Aggregated by OraNA

Blogroll

  • A Journey in Development
  • A Neate Blog
  • Blog by Robert van Mölken (co-author on ICS book)
  • Exigency In Specie
  • Ora World
  • SOA4U

Social

  • View @mp3monster’s profile on Twitter
Follow Phil (aka MP3Monster)'s Blog on WordPress.com

Tags

6 Music Aaron Woody Ace AIA album Ansible API apiary API Platform applications article BBC Big Data blog book books Capgemini cd CEP Cloud code concert conference data Design developer development download ebook enterprise FluentD free fusion Good Morning Nantwich Groovy Helidon integration java JBoss jBPM London Luis Weir meetup Microservices mindmap monitoring Music OIC OIC - ICS OOW Oracle Oracle Press OTN PaaS Packt Packt Publishing Patterns Phill Jupitus playlist podcast Presentation promotion Puppet reading Redhat review Security SeeWhy SOA SOA Suite software Technology TOGAF UKOUG video

Blog at WordPress.com.

Rick's blog

End-to-End OIC to SAP integration

A journey in development

A blog-post by blog-post journey of a ERP Cloud Solutions Degree Apprentice

Phil (aka MP3Monster)'s Blog

from Technology to Music

RedThunder.Blog

Demystifying cloud technologies...

A millennial's musings

Shalindra's Blogs

Technofunctional Blogs

BTplusMore

Business, Technology and more

Creativenauts

Personal, design, inspiration, interests.

PaaS Community Blog

by Jürgen Kress

RedStack

Oracle Cloud Stuff

Musings of an Enterprise Software Technologist

My thoughts on Enterprise Software Technologies...and more.

The Open Group Blog

Achieving business objectives through technology standards

SutoCom Solutions

Success & Satisfaction with the Cloud

Rob's Wall Of Music

Thoughts of a lifelong music hoarder...

DataCentricSec.com

A World of Events

A Blog for Event and Data Analytics

Privacy & Cookies: This site uses cookies. By continuing to use this website, you agree to their use.
To find out more, including how to control cookies, see here: Our Cookie Policy