19CrAYS
“She's tidied up and I can't find anything!<br />
all my tubes and wires<br />
And careful notes<br />
And antiquated notions<br />
but - it's poetry in motion<br />
She blinded me with science!"<br />
She Blinded Me With Science - Thomas Dolby
Eric Kavanagh is the host of DM Radio and Information<br />
Management's Webcasts. He is a veteran journalist and consultant<br />
with 20 years of experience in print, broadcast and online media. For<br />
the past 10 years, he has focused on enterprise technology and<br />
information management. Recently, he helped spur adoption of the<br />
Federal Funding Accountability and Transparency Act, which led to the<br />
deployment of USASpending.gov, a Web-based portal that provides<br />
detailed financial information about all federal contracts and grants.<br />
Prior to joining the magazine, Eric was web editor for TDWI. He also<br />
founded and runs Mobius Media, a strategic communications firm<br />
based in New Orleans. You can email him at<br />
dmradio@sourcemedia.com
Robin Bloor, President and Chief Analyst of The Bloor Group, has over 25<br />
years of experience in software development and IT analysis and consulting.<br />
Robin is an influential and respected researcher and commentator on many<br />
corporate IT issues and strategies. His recent research has focused on<br />
virtualization, SOA, business intelligence, workload automation,<br />
communications enabled business processes, and the evolution of software<br />
tools.<br />
In addition to Robin’s deep technical expertise, he has covered these topic<br />
areas with a focus on end-use customer requirements. He is in great<br />
demand as a presenter at conferences, user groups and seminars addressing<br />
audiences across the world. Robin is a co-author of Service Oriented<br />
Architecture for Dummies, published by Wiley in 2007 and Cloud Computing<br />
for Dummies, published in 2009. He is also the author of The Electronic<br />
Bazaar, published by Brealey in 2000.<br />
Robin is the founder of Bloor Research, an IT analyst company based in the<br />
U.K. In 2003, he was awarded an honorary PhD in Computer Science by<br />
Wolverhampton University in the United Kingdom, in recognition of<br />
“Services to the IT Industry.”
Dr. Geoffrey Malafsky<br />
Geoffrey Malafsky earned a PhD in Nanotechnology from The Pennsylvania State<br />
University. He was a research scientist at the Naval Research Laboratory before<br />
becoming a technology consultant in advanced system capabilities for numerous<br />
Government agencies and corporate clients. He has over thirty years of experience<br />
and is an expert in multiple fields including Nanotechnology, Knowledge Discovery<br />
and Dissemination, and information engineering. He founded and operated the<br />
technology consulting company TECHi2 prior to founding Phasic Systems Inc where<br />
he is the CEO. He has authored numerous articles on semantic technology, agile<br />
governance and other related topics, is regularly invited to speak at industry<br />
conferences and on industry radio programs, and is widely regarded as a leading<br />
technologist in the semantic technology field.
J. Kevin Moran<br />
Vice Admiral J. Kevin Moran was the COO at The Investor Relations Group following a<br />
highly distinguished thirty-two years in the U.S. Navy. He graduated from the United<br />
States Naval Academy with a Bachelor of Science Degree in Oceanography/Physics<br />
and went on to flight training where he was designated a Naval Aviator. He was<br />
Deputy Chief of Naval Personnel where he led the effort to modernize and integrate<br />
the Manpower, Personnel, Training, and Education organization, systems, and data<br />
environments. He is a graduate of the Naval War College in Newport, Rhode Island,<br />
where he earned a Master of Arts degree in International Relations and Strategic<br />
Studies. He is also a graduate of the Advanced Management Program at the Harvard<br />
School of Business. His personal awards include the Navy Distinguished Service<br />
Medal, the Legion of Merit (gold star in lieu of fourth award), the Defense Meritorious<br />
Service Medal, the Meritorious Service Medal (gold star in lieu of third award), the<br />
and the Navy Commendation Medal (gold star in lieu of second award). Additionally, while he<br />
was the Commander of the Naval Education and Training Command, he was awarded<br />
the Elliot Masie Foundation Pioneer Award for visionary training and education<br />
solutions; US News and World Report selected his command as one of the best places<br />
to work in America; and he was awarded the Ellis Island Medal of Honor as a<br />
distinguished first generation American.
Dr. Anand S. Rao is a Principal in PwC's Advisory practice, focused on the financial<br />
services industry. Anand joined PwC as a direct-admit partner from Diamond<br />
Management Technology Consultants in 2010. He has been with PwC/Diamond<br />
since 2001 helping senior executives structure, solve and manage critical issues<br />
facing their organizations. With over 21 years of industry and consulting<br />
experience, Anand has worked extensively on both business and technology issues<br />
across a wide range of industry sectors including, financial services (insurance,<br />
retail, corporate, and investment banking, payments), telecommunications (mobile<br />
and fixed-line), healthcare (payer), aerospace, retail, and resource sectors.<br />
Anand's work has included behavioral economics, simulation modeling, global<br />
growth strategies, marketing, sales, and distribution strategies, online, mobile,<br />
social media strategies, segmentation and proposition development, customer<br />
value management, multi-channel integration, risk management and compliance<br />
(specifically, Basel II), large scale program mobilization and program management.<br />
He has worked and lived in four countries spanning four continents and has<br />
consulted to clients in fifteen countries across six continents.
Phasic Systems Inc.<br />
www.phasicsystemsinc.com<br />
703-945-1378
“Many CIOs believe data is inexpensive because storage has become<br />
inexpensive. But data is inherently messy – it can be wrong, it can be<br />
duplicative, and it can be irrelevant – which means it requires handling,<br />
which is where the real expenses come in. 'The cost of more data is the<br />
application and the computing power and the processes to reconcile all<br />
these things',”<br />
"While there are a myriad of analytical tools that can be leveraged, a<br />
recent study indicated that more than 70% of CMOs feel they are<br />
underprepared to manage the explosion of data and 'lack true insight.'"<br />
1. Wall Street Journal, CIO's Big Problem with Big Data, 2012-08-02<br />
2. Forbes, The CEO/CMO Dilemma: So Much Data, So Little Impact, 2012-07-18<br />
Based on non-IT, well-established, technical field and methods<br />
Has potential to solve intractable, complex challenges but also<br />
vulnerable to same forces that blocked prior ‘movements’<br />
Science is structured, highly analytical with a culture of<br />
questioning assumptions and requiring objective proof<br />
Science is based on big data and handles it every day in every<br />
Science knows how to manage distributed sources, differing<br />
semantics, quality problems, linking business needs to<br />
operational data<br />
But, this is a culture shift from traditional data management<br />
New ideas: raw data, calibration transforms, uncertainty,<br />
continuous assessment and improvement, willingness to admit<br />
errors and immediately fix them
Example from DARPA Evidence<br />
Extraction & Link Discovery<br />
Today’s Situation: ~10k<br />
messages/day from multiple<br />
sources read by multiple<br />
analysts and analyzed in<br />
multiple manual non-integrated<br />
tools<br />
Similar to Social Network<br />
Analysis<br />
“We spent years and many millions of dollars using the best technology and<br />
people and always failed to reconcile our data across major functional areas.<br />
This seriously impeded our efforts to improve Navy training, personnel<br />
placement, and planning to meet requirements for greater efficiency and<br />
adaptability. Only the rationalization process solved this challenge and<br />
amazingly did so with broad consensus across groups.” VADM (Ret) K. Moran<br />
Challenge: Complicated environment with conflicting data<br />
values, standards, and business use cases, and a lack of<br />
documentation. Data owned by 4 major organizations, in multiple<br />
warehouses and data stores, with redundant non-reconciled sets of<br />
data<br />
Requirement: Integrated, common, accurate data to enable a new<br />
integrated workforce planning, training, and management application<br />
(“Sailor of the Future”) for 1 million people<br />
Prior Activities: 10+ years of system integration, data warehouse, and<br />
data governance efforts with no improvement; poor coordination<br />
across organizations and systems<br />
The US Navy struggled for years with creating a common,<br />
trustworthy, governance-directed warehouse of Human<br />
Resources data.<br />
The data was spread out in five major warehouses and dozens<br />
of small databases supporting many applications.<br />
The applications spanned decades of business logic and<br />
technology, from old mainframe COBOL programs to<br />
modern Service Oriented Architecture (SOA) tools based on<br />
web services.<br />
Many projects and years were spent without success despite<br />
extensive analysis, integrated project teams, and highly<br />
experienced personnel.<br />
This was the fundamental problem preventing all<br />
previous efforts, namely the lack of business context<br />
and the different perspectives among user groups and<br />
applications on what each data element really meant.<br />
This problem could not be overcome using traditional<br />
governance, data modeling, and warehousing<br />
techniques because they require knowledge of all<br />
specific requirements, definitions, and use cases before<br />
data modeling and engineering.<br />
US Navy: VADM (Ret) Moran, executive<br />
US Navy: Tim Traverso, former Technical Director<br />
OPNAV, N6<br />
US Navy: Jerry Best, project lead and SME<br />
Contractor: Dr. Geoffrey Malafsky, chief technical<br />
designer and development leader<br />
With one million records covering over fifty years<br />
of data, the data values supported many older<br />
business processes and policies as well as new HR<br />
standards.<br />
This created the situation where data conflicted<br />
within each warehouse and across warehouses<br />
even when the data was supposedly the same<br />
(e.g. job title, salary, address).<br />
A significant number of people had multiple<br />
records listed as currently active with wildly<br />
different values.<br />
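A minimal sketch of how such cases might be surfaced, assuming each record carries a person identifier and an active flag. The field names (`person_id`, `active`, `job_title`, `salary`) and the sample rows are hypothetical and are not taken from the Navy data.

```python
from collections import defaultdict

# Hypothetical records; field names and values are illustrative only.
records = [
    {"person_id": "P1", "active": True, "job_title": "Instructor", "salary": 52000},
    {"person_id": "P1", "active": True, "job_title": "Technician", "salary": 38000},
    {"person_id": "P2", "active": True, "job_title": "Analyst", "salary": 61000},
    {"person_id": "P2", "active": False, "job_title": "Clerk", "salary": 30000},
]


def find_conflicting_active(records, fields=("job_title", "salary")):
    """Return {person_id: {field: sorted distinct values}} for every person
    with more than one active record whose values disagree."""
    active = defaultdict(list)
    for rec in records:
        if rec["active"]:
            active[rec["person_id"]].append(rec)

    conflicts = {}
    for person_id, recs in active.items():
        if len(recs) < 2:
            continue  # a single active record cannot conflict with itself
        differing = {}
        for field in fields:
            values = sorted({str(r[field]) for r in recs})
            if len(values) > 1:
                differing[field] = values
        if differing:
            conflicts[person_id] = differing
    return conflicts


conflicts = find_conflicting_active(records)
```

Here only `P1` is reported, because `P2` has a single active record. A check like this finds the conflicts; deciding which value is correct still needs the business context described in the slides.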
The data encompassed all HR data including:<br />
personnel, training, and manpower analysis and<br />
planning.<br />
It included critical job assignments, career path, job<br />
performance, qualifications, field deployments, and<br />
alignment to enterprise business capabilities.<br />
It spanned decades of changing data models,<br />
definitions, value spaces, and Navy strategies.<br />
Yet, there were problems with the most basic<br />
data fields, which for the Navy, include things<br />
like<br />
billet (effectively a job but also includes other<br />
characteristics),<br />
rank (similar to seniority but with formal rules that change<br />
over time),<br />
rating (similar to vocational ability but also with changing<br />
rules),<br />
and even the primary identifier of a person, the Social<br />
Security Number (SSN).<br />
There were many types of conflicts.<br />
The Personnel warehouse contained Training data because the<br />
Personnel group did not trust the data from what was<br />
supposed to be the authoritative warehouse managed by the<br />
Training group.<br />
Similarly, the Training warehouse contained personnel data<br />
because they did not trust the quality of the data managed by<br />
the Personnel group.<br />
These data sets did not agree, preventing applications from<br />
creating unified functions, and causing erroneous reports and<br />
analytics that had to be manually corrected on a recurring<br />
basis.<br />
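The disagreement between two copies of the same data can be illustrated with a small comparison routine. This is only a sketch: the two extracts, the person ids, the course codes, and the function name `compare_sources` are all hypothetical.

```python
def compare_sources(a, b):
    """Compare two {person_id: {field: value}} extracts and report
    ids missing from either side and fields whose values differ."""
    report = {"only_in_a": [], "only_in_b": [], "mismatched": {}}
    for person_id in sorted(set(a) | set(b)):
        if person_id not in b:
            report["only_in_a"].append(person_id)
        elif person_id not in a:
            report["only_in_b"].append(person_id)
        else:
            fields = sorted(set(a[person_id]) | set(b[person_id]))
            diffs = {
                f: (a[person_id].get(f), b[person_id].get(f))
                for f in fields
                if a[person_id].get(f) != b[person_id].get(f)
            }
            if diffs:
                report["mismatched"][person_id] = diffs
    return report


# Hypothetical training data as held by each warehouse.
personnel_copy = {
    "P1": {"course": "C-101", "completed": "1998-05-01"},
    "P2": {"course": "C-205", "completed": "2001-03-15"},
    "P3": {"course": "C-300", "completed": "2004-09-30"},
}
training_copy = {
    "P1": {"course": "C-101", "completed": "1998-05-01"},
    "P2": {"course": "C-205", "completed": "2001-04-02"},
    "P4": {"course": "C-410", "completed": "2006-01-20"},
}

report = compare_sources(personnel_copy, training_copy)
```

A report of this kind shows where the copies diverge but not which warehouse is right, which is why the manual corrections mentioned above kept recurring.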
The solution was to use Phasic Systems Inc.’s Rapid<br />
Rationalization Process.<br />
The first major advance was to recognize that there<br />
are multiple valid definitions of the key business<br />
concepts (e.g. job, billet, address).<br />
The method accepts and accommodates all variations<br />
by placing them into the predefined semantic<br />
framework, and identifying exactly how they are used<br />
and why.<br />
In this manner, organizational disagreements ceased,<br />
key knowledge was captured, and common data<br />
standards were defined in hours and days instead of<br />
months and years.
Data Rationalization is the process of building and managing a<br />
continuously adaptive data environment that fuels current and<br />
future business needs for decision making and system<br />
operations<br />
It ensures data (i.e. not just metadata) is as accurate,<br />
meaningful, and useful as possible while continuously adjusting<br />
to improve and add capability<br />
It provides collaborative management of data assets, the<br />
designs governing the who, why, and how of data, and the where,<br />
when, and how of data use in operational systems<br />
It solves the great challenge of mapping all source values to<br />
each target along the entire complex paths of enterprise data<br />
use<br />
Consolidated values when possible with continuous improvement<br />
Simplified and adaptive mapping with Corporate NoSQL
Tailored for real environments with complex or<br />
undocumented business/technical activities, models, rules<br />
Use standards-based simple organization-process-technology<br />
model to capture key corporate knowledge from all sources<br />
at any stage of design & operation<br />
List-based entry of intuitive concepts<br />
People according to job and experience<br />
Documents of any kind extracted and correlated<br />
Reverse engineer systems and databases<br />
Produce design and ready-to-implement results<br />
30-60 days<br />
Correlated, authoritative, consensus approved system<br />
models, data models, codes, glossaries, rules<br />
Design Rationalization Issues<br />
• Multiple data models<br />
• Conflicting definitions<br />
• Similar, supposedly similar, operationally<br />
distinct values<br />
• Unknown business logic<br />
• Multiple ETL mappings<br />
Design Rationalization<br />
• Consolidated, adaptive data models<br />
• Standardized definitions<br />
• Synchronized distinct operational values<br />
• Managed business logic<br />
• Coordinated ETL mappings<br />
System Rationalization Issues<br />
• Multiple database systems<br />
• Conflicting formats<br />
• Redundant storage<br />
• Unsynchronized values<br />
• Multiple integration points<br />
• System performance<br />
System Rationalization<br />
• Consolidated, adaptive systems<br />
• Common, interoperable formats<br />
• Common storage<br />
• Synchronized interfaces<br />
• Coordinated integration<br />
• Greater system performance<br />
Corporate NoSQL<br />
Suffix in source A, prefix in B, neither in C for same (part<br />
number, title, …)?<br />
Conflict syntactically (simplest case) and semantically (most<br />
difficult)<br />
Other tools & methods never solve this because they deal with<br />
the obstacles independently or not at all<br />
Data values are out of sync with metadata, data models, BI warehouse<br />
Different Meanings (Legal and Business Activities)<br />
NKY HomeSeekers Texas<br />
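The syntactic case (suffix in source A, prefix in B, neither in C) can be sketched with one pattern per source. The affix `PN`, the source labels, and the function name `normalize` are assumptions made for illustration. The semantic case is not addressed here, since it depends on agreed business definitions rather than string handling.

```python
import re

# Hypothetical per-source formats for the same part number:
#   source A appends a suffix, source B adds a prefix, source C uses neither.
SOURCE_PATTERNS = {
    "A": re.compile(r"(?P<core>[A-Z0-9]+)-PN"),
    "B": re.compile(r"PN-(?P<core>[A-Z0-9]+)"),
    "C": re.compile(r"(?P<core>[A-Z0-9]+)"),
}


def normalize(source, raw_value):
    """Strip the source-specific affix and return the shared core value.

    Raises ValueError when the value does not follow the format expected
    for that source, so unexpected formats are surfaced, not guessed at.
    """
    pattern = SOURCE_PATTERNS[source]
    match = pattern.fullmatch(raw_value.strip().upper())
    if match is None:
        raise ValueError(f"{raw_value!r} does not match the format of source {source}")
    return match.group("core")


values = [normalize("A", "4471-PN"), normalize("B", "PN-4471"), normalize("C", "4471")]
```

All three calls yield the same core value, so the records can be joined on it. Keeping the rules in a table per source means a new source only needs a new entry.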
The Ψ–KORS System Model<br />
Key terms used for decades but debated across organizations and used<br />
differently in data stores. PSI clarified, aligned to standards, and defined<br />
distinct versions as well as common enterprise version<br />
Rating – code of occupational specialty<br />
Rate – rating with pay grade code appended<br />
Billet – roughly associated with a job but used for projections in<br />
manpower, salary in personnel, and classroom seats in training<br />
Task – PSI identified and defined 3 unique varieties: 1) operational<br />
mission, 2) job activities, 3) training curriculum segments<br />
Key data entities stored with different and undocumented business<br />
logic leading to conflicting values and organizational disagreements<br />
Navy Enlisted Classification (NEC) – important value used in most HR<br />
processes and systems. Ad hoc logic adopted over time as<br />
authoritative but in fact not aligned to regulations and changed<br />
frequently
Current data is disjointed and of low quality<br />
Variable use and meaning among systems even for “same” data<br />
elements<br />
Undocumented definitions and data mgmt processes<br />
Errors in data systems<br />
Disagreement among data systems<br />
Lack of existing descriptions for key readiness use cases<br />
Legacy data systems have failed to overcome these<br />
problems despite several years of new<br />
marts/houses/brokers/IPTs/applications
Multiple Rate/Rating entries that conflict per person.<br />
5 entries with 4 ending on the same date (1992-06-30); 2/5 have<br />
start dates after their end dates (e.g. 1992-07-01, 1992-10-16);<br />
2 start and end on the same days but have different rates<br />
Some data systems confuse Rate and Rating<br />
NEC is critical value shared across systems but logic varies<br />
Local manager, “…When we apply an NEC, we look up the priority on a<br />
table, and shuffle the 5 NECs around as needed.”<br />
Manpower: A_RTABBR is empty for 11919 records.<br />
R_RTABBR has values but it uses non-standard Rate codes<br />
that are meant to show a range of paygrades
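Checks for the Rate/Rating conflicts described above can be sketched as follows. The entries are hypothetical, shaped after the slide's description (four of five ending on 1992-06-30), and the rate codes are placeholders, not real Navy codes.

```python
from datetime import date
from itertools import combinations

# Hypothetical Rate entries for one person; rate codes are placeholders.
entries = [
    {"rate": "RATE_A", "start": date(1990, 1, 1), "end": date(1992, 6, 30)},
    {"rate": "RATE_B", "start": date(1990, 1, 1), "end": date(1992, 6, 30)},
    {"rate": "RATE_C", "start": date(1992, 7, 1), "end": date(1992, 6, 30)},
    {"rate": "RATE_D", "start": date(1992, 10, 16), "end": date(1992, 6, 30)},
    {"rate": "RATE_E", "start": date(1992, 7, 1), "end": date(1995, 1, 1)},
]


def inverted_ranges(entries):
    """Indices of entries whose start date falls after their end date."""
    return [i for i, e in enumerate(entries) if e["start"] > e["end"]]


def same_span_different_rate(entries):
    """Index pairs that share start and end dates but carry different rates."""
    return [
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(entries), 2)
        if a["start"] == b["start"] and a["end"] == b["end"] and a["rate"] != b["rate"]
    ]


inverted = inverted_ranges(entries)
clashes = same_span_different_rate(entries)
```

With this sample, the third and fourth entries are flagged as inverted and the first two as a same-span clash. Rules like these are easy to run on every load, which fits the continuous assessment the deck argues for.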
Complicated Mixture of Commercial, Custom, Legacy, Services Applications, Data Stores<br />
Copyright Phasic Systems Inc 2013
Navy HPC<br />
HR-XML<br />
Bridge Organizations, Processes, Technologies to Data Concepts<br />
Logical Models derive directly from conceptual and use business terms<br />
The new data was widely acknowledged as the only<br />
completely integrated, accurate data by all levels of<br />
the organization.<br />
The approach was so successful that the governance<br />
board adopted the name of the new enterprise data<br />
model (Position-Resume) as its own name.<br />
The board meets to this day on a monthly basis<br />
Usually garners high consensus without friction<br />
Different Meanings (Legal and Business Activities)<br />
NKY HomeSeekers Texas<br />
Example solution:<br />
1. Create table – title aligned to business = Garage<br />
2. Create vocabulary for distinct use cases (system, value analysis, business use) =<br />
(spaces, spaces.description, spaces.national, spaces.state, listingservice, ….)<br />
3. Define ETL logic<br />
4. Merge in warehouse and process in virtualization layer<br />
5. Change as needed<br />
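The steps above might be sketched as a small mapping-driven transform. The vocabulary terms and the source names (NKY, HomeSeekers, Texas) come from the slide; the source field names, the row layout, and the function `to_garage_rows` are assumptions made for illustration and do not represent Phasic Systems' implementation.

```python
# Step 1: the table title is aligned to the business term.
TABLE = "Garage"

# Step 2: vocabulary terms as listed on the slide.
VOCABULARY = {
    "spaces",
    "spaces.description",
    "spaces.national",
    "spaces.state",
    "listingservice",
}

# Step 3: ETL logic expressed as data. The source field names are hypothetical.
FIELD_MAP = {
    ("NKY", "garage_spaces"): "spaces",
    ("HomeSeekers", "garage"): "spaces.description",
    ("Texas", "garage_cap"): "spaces.state",
}

# Guard against mappings that point at terms outside the vocabulary.
assert set(FIELD_MAP.values()) <= VOCABULARY


def to_garage_rows(source, record):
    """Translate one source record into rows of the common Garage table.

    Fields without a mapping are skipped, so unrelated columns pass through
    the ETL step without polluting the shared table.
    """
    rows = []
    for field, value in record.items():
        term = FIELD_MAP.get((source, field))
        if term is not None:
            rows.append({"table": TABLE, "term": term, "source": source, "value": value})
    return rows


# Step 4: merge rows from all sources into one list (stand-in for the warehouse).
merged = (
    to_garage_rows("NKY", {"garage_spaces": 2, "bedrooms": 3})
    + to_garage_rows("HomeSeekers", {"garage": "2 car attached"})
)
```

Because the mapping lives in `FIELD_MAP` and not in the code path, step 5 ("Change as needed") amounts to editing that table.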
Costs<br />
Business Alignment: Goal, Capability, Architecture<br />
Data Assets: Systems, Owners, Use<br />