Modified by Scott Blau firstname.lastname@example.org
I have been involved in document capture for 26 years - since well before it was even called "capture." I am often asked about how I came to found Datacap. So taking advantage of my impending exit from the stage (read on for more!), I thought I would share some background on the founding of Datacap as the first true document capture software company.
The microprocessor revolution was in full swing in the mid-1980s when I joined a hospital information systems startup. We concentrated on data collection in operating rooms using what was then cutting edge: IBM PCs networked directly together with TCP/IP. We got sterilized computers into the operating rooms... but we struggled to get nurses to use them.
Turns out that not only was the nursing staff more comfortable working with pen and paper (usually on a clipboard), but they resented having to literally turn their backs on the patient - particularly when they were being asked to enter what was essentially inventory data for billing purposes (the materials and instruments being used during the course of the operation). We also had a backup system in case the PCs or the network didn't work (which was often). It used... paper.
It was in this context that I was struck down with a very nasty virus: chicken pox. You may think of chicken pox as a childhood disease with some discomfort - or even as a great opportunity to play hooky from school - but adult chicken pox, as I quickly learned, is an entirely different beast. I ran a high fever, and for several days I literally could not raise my head off the pillow.
At times I became delirious. And it was in a delirium that Datacap was born. I had visions of paper forms filled in by nurses in the OR dancing around... but, more interestingly, of the data on the pages coming unstuck from the paper and floating off. I saw individual characters very clearly and how they could be segmented to be understood by the computer as data.
Once the fever had burned itself out, I got back to work. One day not long after, a guy I had brought in to help us with some of our tougher user interface challenges showed off an early document scanner. He was building a driver to run the scanner from a Mac for another client of his. It seemed an amazing piece of equipment... and it struck me right there that if we could scan the nurses' sheets, then we could segment the characters and turn paper into data!
Once the wheels started spinning, there was no turning back. Along with Noel Kropf, the guy with the scanner, I founded Datacap and we set to work building our first product, Paper Keyboard, releasing it in 1989. It all seems so inevitable now, but at the time there were nothing but hurdles to overcome: porting to Windows 3.0, adding machine-print OCR, tying multiple machines together to distribute the work, etc., etc. It kept a growing team of developers busy for the next two and a half decades.
And, of course, we faced increasing competition as the "forms processing" business became a recognized speciality in the quickly growing document imaging industry. Some vendors edged into scanning and data entry automation from related areas like manual data entry (Textware, later Captiva), while some started the long transition from hardware to software (Kofax).
Eventually, Datacap became part of IBM in 2010, giving us the opportunity to put down a global footprint. What started off literally as a “vision” in a fever has become a global reality, used by customers worldwide to ingest millions of pages each day. In some ways, for me, that fever never passed. It has energized me for years – and I like to think I have “infected” a few others. If so, then maybe my job is done. I can let a new generation of visionaries take what we have done with IBM Datacap to a new level. That's part of the reason why I decided to step aside from my current role into retirement at the end of this month, not long after I push the "Publish" button on this blog.
I still have a vision for the future of document capture, one that is increasingly mobile and distributed, and one that will make the steady transition from "on-premise" to SaaS for many customers. But I'm sharing it with you now so that you can help make it a reality while I spend some more time on my bike and doing the many things that I haven't had the opportunity to do since I first caught the Datacap bug!
Whatever I’ve done, I’d like to thank the hundreds, and probably thousands, of individuals who have made the document capture space a thriving business arena. Whether you were with Datacap, with IBM, with one of our many partners, or even with a competitor (ha, I know you are reading this!), without all of you, we would not have such a vibrant and successful capture community!
Scott - @CaptureGuru
Modified by Scott Blau email@example.com
As a continuation from my previous post, here are some fundamental questions to ask yourself - and others - as you embark on a distributed capture endeavor:
Is it “Usable?”
An intuitive user interface is essential to facilitate distributed capture. Typically, the people receiving documents are customer-facing, not dedicated and trained capture operators. The solution should provide a clear and simple series of steps that assure a legible document image…
Can it be “Read?”
Poor image quality or, worse, a partially captured document will quickly undermine the benefits of distributed capture, especially downstream when it comes time to extract data with optical character recognition (OCR). This is where most mobile telephone cameras struggle to create high enough quality images to avoid laborious manual effort later in the process. For a step up in quality, select a portable scanner – some are no larger than a thick ruler – that attaches to a laptop or mobile device.
What document is it?
The first, most important piece of information about any scan is the identity of the document itself. Is it an application, a claim, a change-of-address, etc.? That question might be answered by manual input from the person who scanned or took the picture of the document, but it also might be answered by automatic document classification. Remember, your mobile and distributed workforce are not trained capture professionals, so take a belt-and-suspenders strategy on this one…
Is it Accurate?
Determining the accuracy of content extracted from a document is of prime importance. Whether the extraction is manual, or automated with OCR, you need a set of checks and balances to assure users that the solution can be relied upon. For example, if the software is uncertain, how does it notify a user, and which user is it that gets notified?
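To make that last question concrete, here is a minimal sketch of one way to route an uncertain field. Everything in it is an assumption for illustration: the `ExtractedField` type, the `route_field` function, the 0.95 and 0.70 thresholds, and the three destinations are hypothetical and are not taken from any particular capture product.

```python
from dataclasses import dataclass


@dataclass
class ExtractedField:
    name: str
    value: str
    confidence: float  # 0.0 to 1.0, as reported by a hypothetical recognition engine


def route_field(field: ExtractedField,
                accept_at: float = 0.95,
                review_at: float = 0.70) -> str:
    """Decide who, if anyone, gets notified about a field (illustrative thresholds)."""
    if field.confidence >= accept_at:
        # Confident enough: nobody is interrupted.
        return "auto_accept"
    if field.confidence >= review_at:
        # Doubtful: send to a trained back-office operator for verification.
        return "back_office_review"
    # Very low confidence usually points to a bad image, so ask the
    # person who captured the document to scan it again.
    return "ask_capturer_to_rescan"


if __name__ == "__main__":
    for conf in (0.99, 0.80, 0.40):
        field = ExtractedField("amount", "150.00", conf)
        print(conf, route_field(field))
```

The point of the sketch is only that "uncertain" should map to a named person or queue; where you set the thresholds, and who sits behind each destination, is a business decision.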
Is it Safe?
The security of data is essential to consider, especially when handling customer or other sensitive data. Distributed capture must be treated as moving capture into high-risk environments. Make sure you understand what the risk exposure is if a mobile device is lost or stolen in the field.
Is it Faster?
The speed at which the captured document is transferred from the mobile device to your repository or LOB system determines the speed at which it can be processed by the application. The old saying, “a chain is only as strong as its weakest link,” comes into play here. If there is, in fact, a bandwidth limitation for remote users, then the advantages of capturing remotely may be lost in the transfer.
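A quick back-of-the-envelope calculation shows how the transfer can become the weakest link. The page count, file size, and uplink speed below are made-up figures for illustration, not measurements, and `transfer_seconds` is a hypothetical helper.

```python
def transfer_seconds(pages: int, kilobytes_per_page: float, uplink_kbps: float) -> float:
    """Estimate upload time, ignoring protocol overhead and latency.

    Uses decimal units: 1 kilobyte = 8 kilobits.
    """
    total_kilobits = pages * kilobytes_per_page * 8
    return total_kilobits / uplink_kbps


if __name__ == "__main__":
    # Assumed example: a 5-page application at 100 KB per page over a 512 kbps uplink.
    print(f"{transfer_seconds(5, 100, 512):.1f} seconds")
```

Under those assumed numbers the upload alone takes several seconds, before any recognition or routing has happened, so image size and compression settings deserve as much attention as the recognition engine.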
Is it Capable of Handling Anything a User Throws at it?
There are always exceptions and how you manage them is the test of a capture system. Can you add attachments? Can you add a new document you weren’t expecting? Can you annotate a document or route it to a supervisor for review? The closer you are to the customer, the more exceptions you will encounter, so make sure you have the flexibility to handle the unexpected.
Will it work for me?
In most cases, a mobile capture solution will both archive the document images, and route them into a line of business system – as fast as possible for customer satisfaction. For example, an invoice, resume, or contract will be sent to the ERP system. An insurance claim will be forwarded for adjudication. A loan application may link to a case management system, where underwriters will review. A medical document will be appended to the patient’s electronic health record. Make sure your distributed capture system can connect to your business systems and deliver image and data seamlessly.
After all these years in the capture business, I thought things had settled down. People have been saying that document capture is a “mature” technology. And, of course, it is, but the world is changing around us, creating new opportunities. So don’t be shy: if you see a way to shorten the cycles, to deliver better customer service, to improve vendor relations, or to change just about any existing process by capturing documents sooner at distributed/remote locations, then take advantage of the opportunity. Just ask the right questions - and get credible answers – as you navigate to a successful implementation.
Note: An earlier version of this post appeared in April 2013 on John Mancini's Digital Landfill blog.
Follow me on Twitter @CaptureGuru
Modified by Scott Blau firstname.lastname@example.org
It’s a given: the sooner you convert a paper document into an electronic image, the faster, more accurately, and less expensively you can process it. Obvious though it may have been, over the 20+ years I’ve been in this business it’s not been an easy insight to act upon.
In the era of MFPs (multifunction peripherals), mobile phones and mobile data plans, it’s easy to forget how tentative data connectivity was even a short time ago. Even in a commercial setting, banks with branches, insurers with independent brokers, in fact, any organization with far-flung activities, all had big concerns about wide-area bandwidth. Scanning of documents and sending them “over the wire” from remote locations was seen as a luxury.
That perspective is changing – fast.
Converting a paper document to a digital image as soon as the document is received, or even created, is a strategy now within reach of most organizations, in most parts of the world. It's called distributed document capture. It’s different from the old model of centralized capture, where everything is sent to a central processing center.
The good news is that there are now low-cost desktop scanners, mobile scanners, multi-function peripherals (MFPs), and more than a billion smart phones worldwide that can operate as a capture device. The bad news is that being successful with distributed capture is not as simple as snapping a photo. Before you invest in a solution, you need to prepare yourself by asking some key questions... I'm putting some together to share with you in my next post.
Follow me on Twitter @CaptureGuru
Modified by Scott Blau email@example.com
It's time to start planning your agenda for Information On Demand 2013 - aka "IOD," in Las Vegas. Whether you are in IT, Operations, or Finance, IOD is a great networking opportunity: meet with peers and industry experts, and influence the architects of your current solutions. Choose between business, technical, and leadership training sessions, or use the event to expand your understanding of Business Analytics, other Enterprise Content Management (ECM) technologies, and Information Management. There are also special events with today's thought leaders. You will be encouraged to “Think Big,” but maybe just as important, you can also learn how to “Think Fast.”
The main reason to go to IOD? Capture, of course! We're putting the focus on capture in the context of "real-time imaging." What's real-time? That's the time you - and your customers - expect things to happen when they have a smartphone in their hands. Mobile is coming to capture very quickly now. Don't believe me? Then come to IOD to see for yourself. We'll be showing that and related distributed/branch capture capabilities and solutions. You'll see what is available today... and if you pay close attention, we'll give you a sneak peek at the future!
Here are some specific real-time imaging sessions to pencil into your agenda... and there are more to come!
EIC-3440A: Time is Money: Coca-Cola Realizes Process Improvements with IBM Datacap, Speaker: Thomas Fantroy, Coca Cola Refreshments, Manager Imaging & Workflow Solutions, Monday, Nov. 4, 10:15 – 11:15 AM, Lagoon U
EIC-1815A: Mobile and Multifunction Peripheral Transactional Capture to IBM Datacap and Enterprise Content Management, Speaker: Anthony Vigliotti, Notable Solutions, Wednesday, Nov. 6, 4:30 – 5:45 PM, Lagoon IJ
EIC-1667A: What's New with Mobile Capture, Speaker: Mattias Marder, IBM, IBM Research - Image Processing and Computer Vision, Thursday, Nov. 7, 10:00 – 11:00 AM, Lagoon GH
ECG-2224B: Content Integration: A Success Story (Mobile mortgage capture at National Bank of Canada), Speaker: Alain Foisy, National Bank of Canada, ECCM Practice Leader, Tuesday, Nov. 5, 3:00 – 4:00 PM, Lagoon F
Also, be sure to take advantage of the once-a-year opportunity to meet 1:1 with IBM executives, subject matter experts and innovative IBM Business Partners. I'll be there, but you can also talk strategy with other ECM imaging business leaders, such as Brent Bussell, Feri Clayton, Brian Phelps, and Rick Gawronski. Or take a deep dive into Document Imaging and Capture with experts from our product and technical teams, including Tom Stuart, Ben Antin, Jim Reimer, Charles Wiecha, Bud Paton and Noel Kropf.
Learn more about Information On Demand
For ongoing IOD updates, follow me on Twitter @CaptureGuru.
Modified by Scott Blau firstname.lastname@example.org
I'm just back from a trip to India. Until fairly recently, I would never have imagined significant opportunity for document capture in the very place where outsourcing of data entry has been most successful. That's relevant to the document capture business because when a document is "captured" two things happen:
- the paper document is digitized (usually scanned, but sometimes an already electronic document is converted to a standard format), and
- data is extracted from the document - either manually or using OCR - so the document can be filed, and sometimes to populate a line-of-business application.
Call it what you will - indexing, verification, keying - it is a data entry requirement that allows a document that has been digitized to be sent to locations around the world where labor is less expensive. It is exactly in this type of work that India has excelled. A vast, trained workforce has taken on the tedious task of manually extracting information from documents. Even compared to "automated," OCR-assisted data entry that requires relatively expensive labor in North America or Europe, a very competitive alternative has been to take advantage of the much lower-cost labor pool in India (and other countries) to manually enter data, without the help of automation at all.
So why was I in India? To some extent, you can say that the success and breadth of outsourcing initiatives over the last 20 years have changed the underlying economics. Although labor continues to be significantly less expensive in markets such as India, China, and the Philippines - the usual suspects - costs have gone up substantially. They have gone up enough that it is no longer a given that throwing more manpower at a problem, such as manual data entry, is going to be less expensive than investing in automation technologies to assist in the effort. Many organizations are coming to the same conclusion.
Even banks with far-flung operations and massive workforces are exploring ways to automate aspects of the document capture process: the volumes of documents to be captured are staggering once a bank wades into the world of branch capture. (My thoughts on how branch capture is technically something new in document capture: http://ibm.co/13i74bl.) Automation not only reduces costs, but speeds up the process, ultimately helping improve customer service… and most importantly, customer satisfaction. (And if you are skeptical that customer satisfaction is the underlying benefit of document capture, let me try to convince you: http://ibm.co/10bwmsJ.)
Put another way, in large-scale document capture operations, there is a premium on reducing complexity, including the number of people involved. Globally, the Holy Grail is to grow the number of documents being captured, while meeting that growing need with existing staff.
From my perspective, document capture has come of age when it is being adopted globally, even in markets traditionally noted for the low cost of labor.
To continue the conversation, connect with me on Twitter @captureguru.
Modified by Scott Blau email@example.com
It's that time of year - beautiful days in New York, flowers, clear skies… and Smarter Commerce in the air. A year ago I was in Madrid at the IBM Smarter Commerce Summit - I haven't been the same since! It wasn't the protesters teeming in the streets calling out the effects of austerity on youth unemployment, or the luxurious resort (in fact, just another airport hotel) setting for the Summit. It was the conference content...
The keynote was the most interesting presentation I've ever seen at a technology conference (and I have been to many over the past 25 years)! To signal something special was coming, the lights in the large auditorium were turned down. A hush fell over the otherwise bustling room as two people gingerly made their way up onto the stage. One was clearly blind, and it felt a bit awkward to watch him feel his way up multiple steps and across the open space. This was clearly not going to be a standard set of PowerPoint slides...
Over the next half hour the blind presenter on stage held the audience spellbound. His role concerned customer experience management at ING, the Dutch bank. He started off with a simple statement of the problem he faced: since the financial crisis started in 2008, customer trust of banks had hit an all-time low. But without trust, how can you have active, and growing, banking relationships?
It was not the blind leading the blind. This fellow could see clearly that for businesses to thrive, to acquire new customers, to retain existing customers, business didn't need technology that could reach further into a customer's wallet, but a perspective on the customer which focused on creating an individual customer experience... for every customer. Although this was a technology conference, he hardly spent a minute, and not a single slide, on technology.
The focus was on the customer… The presenter told the story of a customer who gets distracted by a phone call while withdrawing cash from an ATM - and walks away leaving the cash in the machine. Apparently, this happens thousands of times a year. As the presenter pointed out, this aborted transaction leaves a lot of anxiety behind. When the customer finally realizes what they have done, a minute, an hour, or a day later, the first thing they want to know is, "where is my money?" Of course, the bank knows: for years cash machines have sucked the money back in and re-deposited it. But the customer doesn’t know that and is left in the dark.
The solution - the customer-centric solution - to this problem was easy: send the customer a text: "Sorry you missed withdrawing €150 from our ATM at X branch - come back again when you can, as we have safely re-deposited it into your acct."
"Easy" conceptually, but monumental for the bank. Why? Because branch and ATM services, although they have access to account information, do not have access to customer contact details squirrelled away in account management and customer service systems. Addressing this challenge requires a willingness to break down long-standing barriers between data silos in the bank.
As I listened to the presentation, it dawned on me what Smarter Commerce was all about. It means engaging with customers in a way that makes the customer feel special, because they are individuals, rather than just a number or one of many.
My head was swimming by the time I walked out of the auditorium. After so many years of focusing on document imaging and capture, I could now see that the value we are offering our customers is not just improved productivity, but the opportunity to help our customers serve their customers better!
I'll share more on this topic soon, but for the time being, the 2013 version of the Smarter Commerce Summit just finished last week in Monaco. You can follow @IBMSmarterCommerce. And here's a perspective on the conference from someone who started thinking about these things a long time before I did - Buy Sell Market Service - When did ECM become a Monte Carlo Celeb?
Modified by Scott Blau firstname.lastname@example.org
I recently had the occasion to meet with several banks that are contemplating, are in the course of implementing, or have gone live with the capture of documents in their branches. For reference, in a recent Celent survey, the number of banks indicating they were highly likely to replace or refresh their core systems over the next 3-5 years jumped from 17% in 2010 to 24% in 2012. Credit unions responded similarly, nearly doubling from 13% to 24%. Each of these implementations is unique, as they reflect the different cultures of the banks involved, as well as specific business and IT issues.
Nevertheless, the migration of document capture from central locations to branches has some universal themes tying all these implementations together. Compared to "traditional," high-volume, centralized scanning, branch capture may herald the future. Here's why I think that and what the implications of the transition are...
20+ years ago when the document capture business got started - it wasn't even called "document capture" then, "forms processing" was the preferred name - it was all about bulk processing. Documents were being brought together anyway, so instead of keying data from them, we helped customers scan them and use recognition to automate the keying of data, or capture of data (as in "Datacap").
The focus of centralized scanning is batch efficiency. Larger batches mean less overhead between batches. Some customers take this to the natural extreme of "continuous scanning." In that scenario, fast, high-volume scanners are kept working constantly. Rather than having an operator scan a batch, then stop and put it to the side while loading a new batch in the scanner, multiple batches are loaded together with only a batch separator sheet between them. The software takes care of the details in the background.
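For the curious, "taking care of the details" boils down to something like the sketch below: cut the continuous stream of pages into batches wherever a separator sheet turns up. This is a simplified illustration and not how any specific product does it. The `split_batches` function and the string stand-ins for pages are hypothetical, and real software would detect a separator from a barcode or patch code on the scanned image.

```python
from typing import Callable, Iterable, List


def split_batches(pages: Iterable[str],
                  is_separator: Callable[[str], bool]) -> List[List[str]]:
    """Group a continuous stream of scanned pages into batches.

    Separator sheets mark batch boundaries and are dropped from the output.
    Consecutive separators do not produce empty batches.
    """
    batches: List[List[str]] = []
    current: List[str] = []
    for page in pages:
        if is_separator(page):
            if current:
                batches.append(current)
                current = []
        else:
            current.append(page)
    if current:
        # The last batch has no trailing separator sheet.
        batches.append(current)
    return batches


if __name__ == "__main__":
    stream = ["a1", "a2", "SEP", "b1", "SEP", "SEP", "c1"]
    print(split_batches(stream, lambda page: page == "SEP"))
```

The operator never stops the scanner; the boundaries are recovered afterwards, which is exactly why larger, back-to-back batches cost so little overhead.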
Branches don't have the kind of volume that necessitates continuous scanning, or even necessarily have "batches" that consist of many pages. Typically in a branch a "batch" is just a single transaction with a customer, perhaps an account application form, ID card, or other documents that are associated with one customer. And, of course, there may be lots of these batches, not necessarily at any one location, but from all the branch locations of a bank.
So one characteristic of branch capture is small batches, but lots of them!
Not only are the batches small, but there is a completely different dynamic associated with processing them. After all, the customer is waiting for the teller or bank officer to respond to what they just submitted to them. In contrast to most bulk scanning operations where processing times of a few hours are considered not only acceptable, but big steps forward in efficiency, with a customer drumming their fingers on the counter, branch capture has a near real-time requirement. 30 seconds, a minute, maybe two, but longer than that and people start to get impatient.
Branch capture systems have to be tuned for very snappy turnaround. Reduce the already short time to process a document and the branch quickly acquires a new customer or retains an existing one with higher satisfaction. Make them wait, and... the negative consequences are immediate!
Capturing lots of small batches remotely with minimum latency puts new demands on a capture system, including the time it takes to transfer images for centralized processing and then to make them available to the branch user to complete a transaction.
But I don't think this is something that is going to be limited to branch banking. By processing customer documents while the customer is there in person, any business can improve customer acquisition, retention, and satisfaction. You can never get enough of those metrics!
Let's continue the conversation: connect with me on Twitter @captureguru.
Modified by Scott Blau email@example.com
4 Non-Trivial Questions to Ask before Committing to Production Document Capture
In late 2009 I got a call from the brother of a good friend. He was a researcher at IBM's Watson Labs - soon to become famous for the "Watson" artificial intelligence engine that spectacularly beat the top humans on the trivia game-show, Jeopardy!
My friend was trying to solve a problem and thought that my company, Datacap (the acquisition of Datacap by IBM was not even on the horizon at this point), could help, since we specialized in optical character recognition (OCR) and related document capture technologies.
I said, "great, let me ask you 3 or 4 questions about what you are trying to do:
1) What is the volume of documents/pages/images you need to process per day, week, month, or year?
2) What data do you need to extract from those pages, any special considerations to take into account?
3) Are the pages consistent in format, variable, something in between?"
He said he had 5000 pages. Clearly to him that was a big number, but he was a bit deflated when I asked, "is that per day?" In the production document capture business, it is definitely common that a volume like that may be literally processed "before breakfast."
But 5000 pages were all he had. Not every day or week, or even every month, just once. I was a little skeptical, but I wanted to learn more.
He needed to extract information from an English language pronunciation guide. He wanted to read the word to be pronounced, and then the linguistically precise definition of the pronunciation, including diacritical marks (accents) commonly used in those definitions. In other words, this was not just straight English language OCR. My skepticism increased.
I wasn't surprised when I next learned that the pages were not at all consistent, that the definitions for a specific word could wrap from one page to the next, or that the pages to be scanned were in bound books...
That was it. Did he really expect to use a production capture product to process - one time - 5000 pages with specialized text and words on them and no fixed format? Well, yes, he did. He had a real challenge and his expectation was not unreasonable... it just is not what production document capture is about.
Those three questions can help anyone quickly assess a document capture problem. In this case, the answer was simple, but perhaps wrong. I advised him that it would not be economically feasible for him to invest in production document capture, but in giving that answer I missed a great opportunity.
Turns out I should have asked a 4th question, "why do you need to read a pronunciation guide?"
I learned later that my friend was working on a major artificial intelligence project, one that would need a computer capable of blurting out words under extreme time pressure. He was, in fact, working on giving "Watson" a voice. It was that voice, having been trained to enunciate thousands of words, that went on prime time to beat the best human players at a live game of Jeopardy!
He eventually used a desktop OCR program and a lot of patience to translate the pronunciation guide from paper to something Watson could understand. Although my 3 questions helped me quickly assess the value of the opportunity, by skipping the 4th question, I missed the opportunity to brag how Datacap helped to give Watson a voice!
Is production document capture and imaging right for you? Click here to learn more on using capture solutions.
Guest blog by Scott Blau, WW Director of Document Capture, IBM Enterprise Content Management
When I think about what Smarter Commerce can mean to a customer, I think of all the reasons I love shopping on Main Street. I don’t do a lot of shopping in person, but when I do, I have pretty high expectations. The places I go to – and return to – all share some common characteristics:
- They know me. I can tell because when I walk in the door, someone smiles at me like a friend!
- They remember me. At my café, I don’t need to ask each time for skim milk in my coffee.
- They take care of me. When I have a question about my bill, they look over my shoulder at it and we go line-by-line to sort out
These days most of my shopping is actually done online. It’s a very different experience from shopping in a store. When I go into an online shop nobody smiles at me. They rarely remember much about me. And when I have a question about the bill… ouch! The out-of-touch call center can’t really take care of me and rarely can even look at the same bill I’m looking at. There is very little that is “smart” about this commerce.
Sure, eCommerce has changed the way I shop and my expectations on the speed of transactions, but I still miss the human touch from the era of Main Street shopping. It’s harder than ever to satisfy me as a consumer, because now I want the best of eCommerce married to the best of Main Street.
I want truly smarter commerce!
- To get instant – and accurate – feedback on my transactions based on my input
- To have a personalized experience where “the system” knows me and remembers my preferences, “anticipating” my next move
- And when I speak to someone on the phone – I really expect them to take care of me as a valuable customer!
full of systems that don’t speak to each other
To meet these high expectations requires a concerted (some may say monumental) effort to break down the barriers between systems. If I’m calling Customer Service, I don’t want to explain what products I have purchased from the company. If I am disputing a charge on a bill that I have in my hand, I expect the person on the other end of the phone to be able to see exactly the same bill I am looking at.
Being able to meet my Main Street expectations in the eCommerce world is where smarter commerce started at IBM twenty years ago, long before the term “Smarter Commerce” was coined. A product now called Content Manager On Demand (CMOD) made it easy to efficiently store images of bills being printed before they were sent to customers. So when I call the company to sort out a billing issue, the customer service rep can easily pull up my bill and see exactly what I am seeing. That’s a good place to start to deliver excellent customer service.
ECM bridges the gap between siloed systems
ECM is good at this because it represents a set of technologies that often are used to span otherwise rigidly siloed systems within an organization. Document imaging often does exactly that – making documents that originate in one area of the business, say orders, available in other areas, such as Customer Service. This is important when customer service wants to see, for example, a customer’s original purchase order.
Case Management – another ECM technology – is great at managing customer interactions in Support or Customer Service. It excels because it avoids using rigid process management. Instead, case management offers the ability to deal with the ‘randomness’ of customers who don’t always fit into pre-defined patterns of interaction. Turns out that when your customers are people they tend to behave like people!! And people don’t tend to follow pre-defined patterns of interaction.
Paper documents continue to challenge organizations that have otherwise committed to electronic commerce. They have paper order forms that won’t go away and paper invoices. Document capture technologies – like OCR and ICR – turn paper into an electronic, “p2e,” complement to eCommerce. And these ECM staples are at their best when they dovetail with an organization’s existing electronic systems.
ECM: Turning eCommerce into Smarter Commerce
Commerce gets smarter, a step at a time, by using technologies that help hide “systems” and instead present a personal face to our customers, our suppliers, and even our employees. I see IBM ECM as a good place to start transforming your eCommerce into something as pleasurable as Main Street shopping – that’s when commerce really gets smarter!
To learn more about how ECM drives Smarter Commerce, attend our sessions on Smarter Processes for Smarter Commerce and Find the Voice of Customer at the IBM Smarter Commerce Global Summit, Orlando 2012, from September 5th to 7th. To learn more about the sessions and register to attend the IBM Smarter Commerce Global Summit 2012, visit the micro-site.