04.06.2013 Views

19CrAYS


“She's tidied up and I can't find anything!
all my tubes and wires
And careful notes
And antiquated notions
but - it's poetry in motion
She blinded me with science!”

She Blinded Me With Science - Thomas Dolby


Eric Kavanagh is the host of DM Radio and Information Management's Webcasts. He is a veteran journalist and consultant with 20 years of experience in print, broadcast and online media. For the past 10 years, he has focused on enterprise technology and information management. Recently, he helped spur adoption of the Federal Funding Accountability and Transparency Act, which led to the deployment of USASpending.gov, a Web-based portal that provides detailed financial information about all federal contracts and grants. Prior to joining the magazine, Eric was web editor for TDWI. He also founded and runs Mobius Media, a strategic communications firm based in New Orleans. You can email him at dmradio@sourcemedia.com


Robin Bloor, President and Chief Analyst of The Bloor Group, has over 25 years of experience in software development and IT analysis and consulting. Robin is an influential and respected researcher and commentator on many corporate IT issues and strategies. His recent research has focused on virtualization, SOA, business intelligence, workload automation, communications-enabled business processes, and the evolution of software tools.

In addition to Robin’s deep technical expertise, he has covered these topic areas with a focus on end-use customer requirements. He is in great demand as a presenter at conferences, user groups and seminars addressing audiences across the world. Robin is a co-author of Service Oriented Architecture for Dummies, published by Wiley in 2007, and Cloud Computing for Dummies, published in 2009. He is also the author of The Electronic Bazaar, published by Brearly in 2000.

Robin is the founder of Bloor Research, an IT analyst company based in the U.K. In 2003, he was awarded an honorary PhD in Computer Science by Wolverhampton University in the United Kingdom, in recognition of “Services to the IT Industry.”


Dr. Geoffrey Malafsky, Ph.D.

Geoffrey Malafsky earned a PhD in Nanotechnology from The Pennsylvania State University. He was a research scientist at the Naval Research Laboratory before becoming a technology consultant in advanced system capabilities for numerous Government agencies and corporate clients. He has over thirty years of experience and is an expert in multiple fields including Nanotechnology, Knowledge Discovery and Dissemination, and information engineering. He founded and operated the technology consulting company TECHi2 prior to founding Phasic Systems Inc, where he is the CEO. He has authored numerous articles on semantic technology, agile governance and other related topics, is regularly solicited to speak at industry conferences and on industry radio programs, and is widely regarded as a leading technologist in the semantic technology field.


J. Kevin Moran

Vice Admiral J. Kevin Moran was the COO at The Investor Relations Group following a highly distinguished thirty-two years in the U.S. Navy. He graduated from the United States Naval Academy with a Bachelor of Science Degree in Oceanography/Physics and went on to flight training where he was designated a Naval Aviator. He was Deputy Chief of Naval Personnel where he led the effort to modernize and integrate the Manpower, Personnel, Training, and Education organization, systems, and data environments. He is a graduate of the Naval War College in Newport, Rhode Island, where he earned a Master of Arts degree in International Relations and Strategic Studies. He is also a graduate of the Advanced Management Program at the Harvard School of Business. His personal awards include the Navy Distinguished Service Medal, the Legion of Merit (gold star in lieu of fourth award), the Defense Meritorious Service Medal, the Meritorious Service Medal (gold star in lieu of third award), the Navy Commendation Medal (gold star in lieu of second award). Additionally, while he was the Commander of the Naval Education and Training Command, he was awarded the Elliot Masie Foundation Pioneer Award for visionary training and education solutions; US News and World Report selected his command as one of the best places to work in America; and he was awarded the Ellis Island Medal of Honor as a distinguished first generation American.


Dr. Anand S. Rao is a Principal in PwC's Advisory practice, focused on the financial services industry. Anand joined PwC as a direct-admit partner from Diamond Management Technology Consultants in 2010. He has been with PwC/Diamond since 2001 helping senior executives structure, solve and manage critical issues facing their organizations. With over 21 years of industry and consulting experience, Anand has worked extensively on both business and technology issues across a wide range of industry sectors including financial services (insurance, retail, corporate, and investment banking, payments), telecommunications (mobile and fixed-line), healthcare (payer), aerospace, retail, and resource sectors. Anand's work has included behavioral economics, simulation modeling, global growth strategies, marketing, sales, and distribution strategies, online, mobile, social media strategies, segmentation and proposition development, customer value management, multi-channel integration, risk management and compliance (specifically, Basel II), large scale program mobilization and program management. He has worked and lived in four countries spanning four continents and has consulted to clients in fifteen countries across six continents.


Phasic Systems Inc.

www.phasicsystemsinc.com

703-945-1378


“Many CIOs believe data is inexpensive because storage has become inexpensive. But data is inherently messy – it can be wrong, it can be duplicative, and it can be irrelevant – which means it requires handling, which is where the real expenses come in. ‘The cost of more data is the application and the computing power and the processes to reconcile all these things.’”

“While there are a myriad of analytical tools that can be leveraged, a recent study indicated that more than 70% of CMOs feel they are underprepared to manage the explosion of data and ‘lack true insight.’”

1. Wall Street Journal, CIO's Big Problem with Big Data, 2012-08-02

2. Forbes, The CEO/CMO Dilemma: So Much Data, So Little Impact, 2012-07-18



Based on non-IT, well-established, technical field and methods

Has potential to solve intractable, complex challenges but also vulnerable to same forces that blocked prior ‘movements’

Science is structured, highly analytical with a culture of questioning assumptions and requiring objective proof

Science is based on big data and handles it every day in every

Science knows how to manage distributed sources, differing semantics, quality problems, linking business needs to operational data

But, this is a culture shift from traditional data management

New ideas: raw data, calibration transforms, uncertainty, continuous assessment and improvement, willingness to admit errors and immediately fix them


Example from DARPA Evidence Extraction & Link Discovery

Today’s Situation: ~10k messages/day from multiple sources read by multiple analysts and analyzed in multiple manual non-integrated tools

Similar to Social Network Analysis


“We spent years and many millions of dollars using the best technology and people and always failed to reconcile our data across major functional areas. This seriously impeded our efforts to improve Navy training, personnel placement, and planning to meet requirements for greater efficiency and adaptability. Only the rationalization process solved this challenge and amazingly did so with broad consensus across groups.” VADM (Ret) K. Moran


Challenge: Complicated environment with conflicting data values, standards, business use cases, and lack of documentation. Data owned by 4 major organizations, in multiple warehouses and data stores, with redundant non-reconciled sets of data

Requirement: Integrated, common, accurate data to enable new integrated workforce planning, training, management application (“Sailor of the Future”) for 1 million people

Prior Activities: 10+ years of system integration, data warehouse, data governance efforts with no improvement, poor coordination across organizations and systems


The US Navy struggled for years with creating a common, trustworthy, governance-directed warehouse of Human Resources data.

The data was spread out in five major warehouses and dozens of small databases supporting many applications.

The applications spanned decades of business logic and technology, from old mainframe COBOL programs through modern Service-Oriented Architecture (SOA) web-service-based tools.

Many projects and years were spent without success despite extensive analysis, integrated project teams, and highly experienced personnel.


This was the fundamental problem that defeated all previous efforts: the lack of business context and the different perspectives among user groups and applications on what each data element really meant.

This problem could not be overcome using traditional governance, data modeling, and warehousing techniques because they require knowledge of all specific requirements, definitions, and use cases before data modeling and engineering.


US Navy: VADM (Ret) Moran, executive

US Navy: Tim Traverso, former Technical Director OPNAV, N6

US Navy: Jerry Best, project lead and SME

Contractor: Dr. Geoffrey Malafsky, chief technical designer and development leader


With one million records covering over fifty years of data, the data values supported many older business processes and policies as well as new HR standards.

This created the situation where data conflicted within each warehouse and across warehouses even when the data was supposedly the same (job title, salary, address).

A significant number of people had multiple records listed as currently active with wildly different values.
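As a rough illustration of the last point, the check below flags people who have more than one active record with disagreeing values. It is only a sketch: the record layout, field names, and sample values are hypothetical and are not taken from the Navy data.

```python
from collections import defaultdict

# Hypothetical sample rows; the field names and values are assumptions.
records = [
    {"person_id": "P001", "status": "active", "job_title": "Instructor", "salary": 52000},
    {"person_id": "P001", "status": "active", "job_title": "Technician", "salary": 38000},
    {"person_id": "P002", "status": "active", "job_title": "Analyst", "salary": 61000},
    {"person_id": "P002", "status": "inactive", "job_title": "Clerk", "salary": 30000},
]


def conflicting_active_records(rows, fields=("job_title", "salary")):
    """Return {person_id: {field: distinct values}} for every person who has
    more than one active record and whose active records disagree."""
    active = defaultdict(list)
    for row in rows:
        if row["status"] == "active":
            active[row["person_id"]].append(row)

    conflicts = {}
    for person_id, person_rows in active.items():
        if len(person_rows) < 2:
            continue  # a single active record cannot conflict with itself
        differing = {}
        for name in fields:
            values = {r[name] for r in person_rows}
            if len(values) > 1:
                differing[name] = sorted(values, key=str)
        if differing:
            conflicts[person_id] = differing
    return conflicts


conflicts = conflicting_active_records(records)
```

A report like this only locates the conflicts; deciding which value is correct still needs the business context discussed in the rest of the deck.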



The data encompassed all HR data including: personnel, training, and manpower analysis and planning.

It included critical job assignments, career path, job performance, qualifications, field deployments, and alignment to enterprise business capabilities.

It spanned decades of changing data models, definitions, value spaces, and Navy strategies.


Yet, there were problems with the most basic data fields, which for the Navy include things like

billet (effectively a job but also includes other characteristics),

rank (similar to seniority but with formal rules that change over time),

rating (similar to vocational ability but also with changing rules),

and even the primary identifier of a person, the Social Security Number (SSN).


There were many types of conflicts.

The Personnel warehouse contained Training data because the Personnel group did not trust the data from what was supposed to be the authoritative warehouse managed by the Training group.

Similarly, the Training warehouse contained personnel data because the Training group did not trust the quality of the data managed by the Personnel group.

These data sets did not agree, preventing applications from creating unified functions and causing erroneous reports and analytics that had to be manually corrected on a recurring basis.
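The disagreement between an authoritative store and a distrusted local copy can be measured before any reconciliation starts. The sketch below compares one field across two stores keyed by a person identifier; the store contents, the field name `course_status`, and the identifiers are all hypothetical.

```python
def compare_copies(authoritative, local_copy, field_name):
    """Compare one field across two stores keyed by person id.

    Returns (mismatched, missing_in_copy, missing_in_authoritative), where
    mismatched maps person id -> (authoritative value, copied value).
    """
    mismatched = {}
    for person_id in authoritative.keys() & local_copy.keys():
        left = authoritative[person_id].get(field_name)
        right = local_copy[person_id].get(field_name)
        if left != right:
            mismatched[person_id] = (left, right)
    missing_in_copy = sorted(authoritative.keys() - local_copy.keys())
    missing_in_authoritative = sorted(local_copy.keys() - authoritative.keys())
    return mismatched, missing_in_copy, missing_in_authoritative


# Hypothetical data: the Training warehouse and the Training data kept
# separately inside the Personnel warehouse.
training_warehouse = {
    "P001": {"course_status": "complete"},
    "P002": {"course_status": "enrolled"},
    "P003": {"course_status": "complete"},
}
personnel_copy_of_training = {
    "P001": {"course_status": "complete"},
    "P002": {"course_status": "dropped"},
    "P004": {"course_status": "enrolled"},
}

mismatched, missing_in_copy, missing_in_authoritative = compare_copies(
    training_warehouse, personnel_copy_of_training, "course_status"
)
```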



The solution was to use Phasic Systems Inc.’s Rapid Rationalization Process.

The first major advance was to recognize that there are multiple valid definitions of the key business concepts (e.g. job, billet, address).

The method accepts and accommodates all variations by placing them into the predefined semantic framework, and identifying exactly how they are used and why.

In this manner, organizational disagreements ceased, key knowledge was captured, and common data standards were defined in hours and days instead of months and years.
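One way to picture "multiple valid definitions" is a concept that keeps every variant next to a common enterprise meaning, instead of forcing a single winner. The sketch below is not Phasic Systems' framework: the class names and the `used_for` wording are assumptions, and the billet variants are paraphrased from the deck's own term list.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Definition:
    context: str   # group or system that uses this meaning
    meaning: str   # how the term is used there
    used_for: str  # why that group needs it (assumed wording)


@dataclass
class Concept:
    name: str
    enterprise_meaning: str
    variants: list = field(default_factory=list)

    def add_variant(self, context, meaning, used_for):
        """Record a variant alongside the others instead of replacing them."""
        self.variants.append(Definition(context, meaning, used_for))

    def meaning_in(self, context):
        """Return the meaning for one context, else the enterprise meaning."""
        for variant in self.variants:
            if variant.context == context:
                return variant.meaning
        return self.enterprise_meaning


billet = Concept("billet", "roughly associated with a job")
billet.add_variant("manpower", "unit for projections", "workforce planning")
billet.add_variant("personnel", "basis for salary", "pay administration")
billet.add_variant("training", "classroom seat", "course scheduling")
```

Because each group's usage is recorded rather than overruled, no group has to give up its definition before a common standard can be written down.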


Data Rationalization is the process of building and managing a continuously adaptive data environment that fuels current and future business needs for decision making and system operations

It ensures data (i.e. not just metadata) is as accurate, meaningful, and useful as possible while continuously adjusting to improve and add capability

It provides collaborative management of data assets, the designs governing the who, why, and how of data, and the where, when, and how of data use in operational systems

It solves the great challenge of mapping all source values to each target along the entire complex paths of enterprise data use

Consolidated values when possible with continuous improvement

Simplified and adaptive mapping with Corporate NoSQL


Tailored for real environments with complex or undocumented business/technical activities, models, rules

Use standards-based simple organization-process-technology model to capture key corporate knowledge from all sources at any stage of design & operation

List-based entry of intuitive concepts

People according to job and experience

Documents of any kind extracted and correlated

Reverse engineer systems and databases

Produce design and ready-to-implement results

30-60 days

Correlated, authoritative, consensus approved system models, data models, codes, glossaries, rules


Design Rationalization Issues

• Multiple data models

• Conflicting definitions

• Similar, supposedly similar, operationally distinct values

• Unknown business logic

• Multiple ETL mappings

Design Rationalization

• Consolidated, adaptive data models

• Standardized definitions

• Synchronized distinct operational values

• Managed business logic

• Coordinated ETL mappings

System Rationalization Issues

• Multiple database systems

• Conflicting formats

• Redundant storage

• Unsynchronized values

• Multiple integration points

• System performance

System Rationalization

• Consolidated, adaptive systems

• Common, interoperable formats

• Common storage

• Synchronized interfaces

• Coordinated integration

• Greater system performance


Corporate NoSQL


Suffix in source A, prefix in B, neither in C for same (part number, title, …)?

Conflict syntactically (simplest case) and semantically (most difficult)

Other tools & methods never solve this because they deal with the obstacles independently or not at all

Data values are out of sync with metadata, data models, BI warehouse

Different Meanings (Legal and Business Activities)

NKY HomeSeekers Texas
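The syntactic half of the prefix/suffix question can be handled with one explicit rule per source. The sketch below assumes a hypothetical marker `PN`, a hyphen separator, and three sources named A, B, and C; none of these formats come from the deck. It does nothing for the semantic case, where the same string carries different meanings.

```python
import re

# Hypothetical per-source syntax rules; "PN" and "-" are assumptions.
SOURCE_PATTERNS = {
    "A": re.compile(r"^(?P<core>\w+)-PN$"),  # marker as a suffix
    "B": re.compile(r"^PN-(?P<core>\w+)$"),  # marker as a prefix
    "C": re.compile(r"^(?P<core>\w+)$"),     # no marker at all
}


def normalize(source, raw):
    """Strip the source-specific decoration and return the shared core value."""
    match = SOURCE_PATTERNS[source].match(raw.strip())
    if match is None:
        raise ValueError(f"{raw!r} does not match the syntax expected from source {source}")
    return match.group("core")


cores = [normalize("A", "12345-PN"), normalize("B", "PN-12345"), normalize("C", "12345")]
```

Rejecting values that do not fit their source's rule, rather than guessing, keeps unexpected formats visible instead of silently merging them.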



The Ψ–KORS System Model


Key terms used for decades but debated across organizations and used differently in data stores. PSI clarified, aligned to standards, and defined distinct versions as well as common enterprise version

Rating – code of occupational specialty

Rate – rating with pay grade code appended

Billet – roughly associated with a job but used for projections in manpower, salary in personnel, and classroom seats in training

Task – PSI identified and defined 3 unique varieties: 1) operational mission, 2) job activities, 3) training curriculum segments

Key data entities stored with different and undocumented business logic leading to conflicting values and organizational disagreements

Navy Enlisted Classification (NEC) – important value used in most HR processes and systems. Ad hoc logic adopted over time as authoritative but in fact not aligned to regulations and changed frequently
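The Rate and Rating definitions above imply a simple composition rule, sketched here. The code shapes are placeholders: letters for the rating and digits for the pay grade code are assumptions made for illustration, not the Navy's actual coding scheme.

```python
import re

# Assumed shape: uppercase rating letters followed by a numeric pay grade code.
RATE_PATTERN = re.compile(r"^(?P<rating>[A-Z]+)(?P<paygrade>\d+)$")


def make_rate(rating, paygrade_code):
    """Build a rate by appending the pay grade code to the rating."""
    return f"{rating}{paygrade_code}"


def split_rate(rate):
    """Split a rate back into (rating, pay grade code)."""
    match = RATE_PATTERN.match(rate)
    if match is None:
        raise ValueError(f"{rate!r} is not a rating followed by a pay grade code")
    return match.group("rating"), match.group("paygrade")


rate = make_rate("XX", "5")  # "XX" and "5" are placeholder codes
parts = split_rate(rate)
```

Under these assumptions, a value that fails `split_rate` in a field documented as holding a Rate is a candidate for a Rate/Rating mix-up.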


Current data is disjointed and of low quality

Variable use and meaning among systems even for “same” data elements

Undocumented definitions and data management processes

Errors in data systems

Disagreement among data systems

Lack of existing descriptions for key readiness use cases

Legacy data systems have failed to overcome these problems despite several years of new marts/houses/brokers/IPTs/applications


Multiple Rate/Rating entries that conflict per person.

5 entries with 4 ending on the same date (1992-06-30); 2/5 have start dates after their end dates (e.g. 1992-07-01, 1992-10-16); 2 start and end on the same days but have different rates

Some data systems confuse Rate and Rating

NEC is critical value shared across systems but logic varies

Local manager, “…When we apply an NEC, we look up the priority on a table, and shuffle the 5 NECs around as needed.”

Manpower: A_RTABBR is empty for 11919 records.

R_RTABBR has values but it uses non-standard Rate codes that are meant to show a range of paygrades
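The date problems in the five-entry example can be found mechanically. The sketch below rebuilds a data set with the same shape as that example; the rate codes and every date other than 1992-06-30, 1992-07-01, and 1992-10-16 are invented for illustration.

```python
from datetime import date
from itertools import combinations

# Hypothetical entries for one person, shaped like the five-entry example.
entries = [
    {"rate": "R1", "start": date(1990, 1, 1), "end": date(1992, 6, 30)},
    {"rate": "R2", "start": date(1990, 1, 1), "end": date(1992, 6, 30)},
    {"rate": "R3", "start": date(1992, 7, 1), "end": date(1992, 6, 30)},
    {"rate": "R4", "start": date(1992, 10, 16), "end": date(1992, 6, 30)},
    {"rate": "R5", "start": date(1992, 7, 1), "end": date(1993, 1, 1)},
]


def inverted_ranges(rows):
    """Entries whose start date falls after their end date."""
    return [row for row in rows if row["start"] > row["end"]]


def same_span_conflicts(rows):
    """Pairs of entries covering identical dates but carrying different rates."""
    return [
        (a["rate"], b["rate"])
        for a, b in combinations(rows, 2)
        if (a["start"], a["end"]) == (b["start"], b["end"]) and a["rate"] != b["rate"]
    ]


inverted = [row["rate"] for row in inverted_ranges(entries)]
span_conflicts = same_span_conflicts(entries)
```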


Complicated Mixture of Commercial, Custom, Legacy, Services Applications, Data Stores

Copyright Phasic Systems Inc 2013


Navy HPC

HR-XML


Bridge Organizations, Processes, Technologies to Data Concepts


Logical Models derive directly from conceptual and use business terms


The new data was widely acknowledged as the only completely integrated, accurate data by all levels of the organization.

The approach was so successful that the governance board adopted the name of the new enterprise data model (Position-Resume) as its own name.

Meets to this day on a monthly basis

Usually garners high consensus without friction


Different Meanings (Legal and Business Activities)

NKY HomeSeekers Texas

Example solution:

1. Create table – title aligned to business = Garage

2. Create vocabulary for distinct use cases system, value analysis, business use = (spaces, spaces.description, spaces.national, spaces.state, listingservice, ….)

3. Define ETL logic

4. Merge in warehouse and process in virtualization layer

5. Change as needed
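Steps 1 to 3 above might look roughly like this in code. The sketch is an illustration only: the raw field names (`garage`, `garage_capacity`, `garage_type`) and sample values are hypothetical, and only three of the listed vocabulary keys are used because the deck does not spell out what `spaces.national` and `spaces.state` hold.

```python
import re

TABLE = "Garage"  # step 1: table title aligned to the business term
VOCABULARY = ("spaces", "spaces.description", "listingservice")  # step 2 (subset)


def from_nky_homeseekers(raw):
    """Assumed source shape: one free-text field such as '2 car attached'."""
    text = raw["garage"]
    match = re.match(r"\s*(\d+)", text)
    return {
        "listingservice": "NKY HomeSeekers",
        "spaces": int(match.group(1)) if match else None,
        "spaces.description": text,
    }


def from_texas(raw):
    """Assumed source shape: separate capacity and type fields."""
    return {
        "listingservice": "Texas",
        "spaces": raw["garage_capacity"],
        "spaces.description": raw["garage_type"],
    }


# Step 3: ETL logic, one rule per listing service.
ETL_RULES = {"NKY HomeSeekers": from_nky_homeseekers, "Texas": from_texas}


def load_garage(rows):
    """Apply each source's rule so every row uses the shared vocabulary."""
    return [ETL_RULES[source](raw) for source, raw in rows]


garage_table = load_garage([
    ("NKY HomeSeekers", {"garage": "2 car attached"}),
    ("Texas", {"garage_capacity": 3, "garage_type": "Detached"}),
])
```

Keeping the rules in a dictionary keyed by listing service means step 5, "change as needed", touches one entry at a time.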



Costs

Business Alignment: Goal, Capability, Architecture

Data Assets: Systems, Owners, Use
