The Classification World
Happy New Year.
With the new year, I've decided to shift my writing and publishing efforts over to a new blog.
I've entitled that blog "(un)structured". So for all you loyal "Classification World" readers (hi mom!), mosey on over to my new blog, where I'll be covering not only content classification but also content analytics, the intersection of structured and unstructured information, and other issues in ECM and information management.
Of course I'll continue to use twitter (@joshpayne) and look forward to more writing and interaction in 2010.
The Information on Demand (IOD) conference, the last week of October, was an exciting week for me personally. At the conference, we made some new product announcements on projects I've been working on pretty hard (and as such it's taken me this long to peek out from the resulting pile-up of work to comment here). It was gratifying to see a lot of these ideas really spread beyond the virtual walls of IBM.
Today, I'll give some color on one of the announcements – InfoSphere Content Assessment.
I've been thinking about this concept of content assessment as part of my classification work for the past year or so. In my travels, when I introduced the idea of "content classification" to enterprise content customers, I found that the phrase could mean very different things to different people – and oftentimes different from what I thought it meant.
I was frequently framing content classification as an automated tool for taking action – classifying content as part of the process of organizing unstructured information. Classification as part of the email archiving process. Classification as a tool for executing records management.
Yet many of my customers would hear about content classification and conceive of it as a tool for understanding what they have, divorced from the idea of taking immediate action (at least taking action in the short term). These customers knew they had a lot of information in an unmanaged state. They knew it posed some level of risk to their organization, but they weren't sure what type of risk. They weren't sure where the risk was the highest. They weren't sure where the ROI for tackling the risk lay and where the ROI was weak.
These customers wanted to assess the state of their "content in the wild" and figure out their action plans. Their IT groups are great at telling them the hard stats based on information outside the content – the amount of disk in the enterprise, the utilization of that disk, etc. But IT can't tell them what lies within those millions of files. How do they know what content needs to be saved before they can decommission a system? Where should they start their records program? How can they assess the potential ROI for a records program? All these questions beg for insight that only the insides of the content residing in the enterprise can give. Content classification and, more broadly, content analytics provide this kind of insight to help answer these bigger-picture questions as organizations seek to gain better control over their content.
Customers were frequently saying to me "I have 10 terabytes of content, unmanaged. I don't know what I should do. I don't know what I can do with it. It's too much to sift through manually."
InfoSphere Content Assessment is designed to help those customers.
That is the short introduction on how I came to realize the importance of content assessment for organizations. Others who were at the conference have written up their own impressions of Content Assessment, including Brian Hill of Forrester and Fern Halper of Hurwitz Associates.
Last winter, the TV program Frontline helped cut through the noise and confusion of the economic recession. They aired an episode entitled "Inside the Meltdown" that explained the happenings in a clear manner. The episode focused on the concept of moral hazard – the simple idea that a party "insulated from risk may behave differently from the way it would behave if it were fully exposed." If you know someone is going to back you up, you're more likely to take risks because the downside of those risks is softened.
I was reminded of this concept a few weeks ago when I spoke to a gathering of ECM customers and partners. The question arose whether automated classification introduces a moral hazard of its own: if users know the system will classify content for them, won't they participate even less in organizing it themselves?
My view on this is that end-users are unlikely to participate for a host of reasons – that’s one of the main drivers for incorporating automated classification in the first place. So to worry about reduced participation at that point . . . is like worrying about closing the barn door after the horses have gotten out.
The other incentives to participate (or not) are more likely to determine users' behaviors: things like knowledge of the topic, level of distraction and personal stake in a positive outcome.
Regardless, I found the link to moral hazard interesting, and hope you do too.
What do you think?
I've had a chance to talk to more customers and business partners over the last few weeks and months about classification. Frequently these customers ask me "How long is it going to take for me to get up and running?" Or if they've learned more about our product, they ask "It seems like it's going to take a lot of effort to train the system?"
Well-educated, motivated knowledge workers don't want to be forced to take on a dreary task like dividing documents into categories. And they see some of that drudgery in their future with a classification training process.
Typically, I provide a boring "It depends" answer and dive into the variables that come into play. In the end, there's a big range, depending on how well your business policies are defined and, in turn, captured in business applications like records management systems. I'll post more on the topic in the future. We can automate parts of the process and make classification training less burdensome. But bottom line: there's going to be some necessary dirty work.
But my colleague, Michele Kersey, provided a great answer to a customer this week on this question that cut to the heart of the matter. The quote that stuck out in my mind was “Content classification scales out the human element.”
The phrase struck me for two reasons. First, it nicely encapsulates the difference between rule-based classification and the more advanced, context-based classification methods. Rule-based methods force humans to try to classify content under the constraints of binary logic. Lots of IF-THEN statements. Sure, computer science majors can think that way, but John in marketing and Jane in HR don't think in that manner. Embarking on a rule-based classification project exposes the classic IT/LOB gap.
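To make the contrast concrete, here is a minimal sketch of what hand-built, rule-based classification logic looks like. The categories and keywords are invented for illustration:

```python
# A minimal, invented illustration of rule-based classification logic.
# Every document is forced through a chain of binary tests.
def classify_by_rules(text):
    text = text.lower()
    if "invoice" in text or "purchase order" in text:
        return "Finance"
    elif "resume" in text or "offer letter" in text:
        return "HR"
    elif "press release" in text:
        return "Marketing"
    else:
        return "Unclassified"  # everything the rules didn't anticipate

print(classify_by_rules("Please find the attached invoice for Q3."))  # Finance
```

Every exception means another IF statement, and John and Jane are rarely the ones who can write it.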
By taking a training-based approach to content classification (using example documents, provided and/or validated by business stakeholders, to "teach" a classification engine), you encapsulate the human element in your classification logic. A human wrote the documents being used to train the system. And a human chose those "typical" documents.
This training process takes effort, but the scale and scope of that effort pales in comparison to the effort you’d need to harness otherwise – which is the second reason I liked Michele’s quote. Automated classification using training based methods scales out the effort your organization has put into training the system. Yes, it takes effort to train a classification system, but you’ll earn back savings on that effort in the following weeks, months and years that the human-based logic is applied and re-applied to answering categorization questions that your workers would otherwise need to handle.
Train the system once, with a small set of workers, and the same work that those workers had to do manually will be executed automatically over and over and over again throughout your organization. Do some work to plant the seed of classification and watch it eliminate repetitive tasks throughout your organization.
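For the programmatically inclined, here is a generic learn-by-example sketch, using scikit-learn rather than the Classification Module's actual engine, with invented training samples. It shows the one-time training effort being scaled out:

```python
# A generic learn-by-example sketch (not the Classification Module's actual
# engine): train once on documents supplied by business stakeholders, then
# apply the learned model automatically from then on.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Invented samples standing in for stakeholder-validated example documents.
samples = [
    ("Attached is the invoice for your recent order.", "Finance"),
    ("Quarterly budget figures are due Friday.",       "Finance"),
    ("Candidate resume attached for the open role.",   "HR"),
    ("Reminder: benefits enrollment closes soon.",     "HR"),
]
texts, labels = zip(*samples)

model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(texts, labels)  # the one-time "dirty work"

# The trained model now answers categorization questions automatically.
print(model.predict(["Please process this vendor invoice."])[0])  # expected: Finance
```

The four training samples here stand in for the stakeholder-validated examples described above; in practice you would train on hundreds of them.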
Precision and recall. It's a topic that is frequently misunderstood in the search and classification market. It took me a couple of years hanging around the PhDs to get my head around the concepts. Every time I got confused, I'd check Wikipedia and I'd just get more confused, because the Wikipedia entries on precision and recall seem to have been written by the PhDs, for the PhDs. Even my own products' documentation confused me. False negatives? False positives? I just got it all mixed up.
When it comes to one-hour introductions to classification, this is one of those topics that gets hyper-simplified under the banner of "accuracy". There are so many other new technology and application topics that need to be covered that we just don't pick this battle when we talk about content classification to customers. But this blog is close to a year old, and this topic has come up not once but three times in the last week or so with customers and business partners. So I figure it's time to address it. The audience is ready (I hope).
Where should we start? OK. Let's start with the topic of perfection. Now I hope this doesn't come as a shock to you, but these automated methods of classification that I've been touting over the last year or so on this blog and on twitter (and in embarrassing videos) aren't perfect. They're not perfect? Shocking, I know. Automated classification doesn't get it right 100% of the time, and there are two ways to handle that fact. One approach is to . . .
1) Be exclusive. Be a snob. Only accept the best results. Only accept results from automated classification that the system is highly confident are accurate. If we're not close to certain that this is the right answer, then don't accept it. Being highly exclusive like this is what is meant by having high precision. If you insist on high precision, you won’t categorize all your content automatically (you’ll skip over a lot of the content and leave it uncategorized), but the ones you do categorize, you'll do so with reasonably good success.
The other approach is to . . .
2) Be inclusive. Be a populist. Accept the best answer (or answers), no matter what. Use the top classification guesses despite low levels of confidence in their correctness. What's the impact of this? You'll maximize the gross number of right answers that you get, but it comes at the cost of getting a bunch more wrong. You'll be handling as much as possible automatically, you'll be getting the highest number of answers right, but you'll also be getting more answers wrong. Being highly inclusive means you have high recall.
In real life applications of classification (or search), you can't have it both ways. Everyone wants to have high precision (our answers are always right!) with high recall (we answer every question correctly) . . . but unfortunately that's not realistic. Even our best methods of classification are imperfect. An organization typically needs to make a decision as to how to balance the two factors of precision and recall. Is it more important to try to get more answers right? Or is it more important to have the answers you do provide be correct? The more answers you get right, the more answers you're also going to get wrong.
For the graphically inclined, the tradeoff can be visualized as follows:
The further you move to the right, the lower your precision gets and the higher your recall climbs. Organizations using automated classification need to determine what their curve looks like . . . and then determine what point on the curve is right for them.
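If you want to see the mechanics behind such a curve, here is a small sketch, with invented confidence scores, that traces precision and recall at different acceptance thresholds:

```python
# A sketch of tracing the precision/recall curve for a classifier that
# attaches a confidence score to each answer. Scores and outcomes invented.
# precision: of the documents we accepted, how many did we get right?
# recall:    of all documents, how many did we classify correctly?
predictions = [  # (confidence, was_the_suggested_category_correct)
    (0.99, True), (0.95, True), (0.90, True), (0.85, False),
    (0.80, True), (0.70, False), (0.60, True), (0.40, False),
]

for threshold in (0.95, 0.85, 0.65, 0.30):
    accepted = [ok for conf, ok in predictions if conf >= threshold]
    precision = sum(accepted) / len(accepted) if accepted else 0.0
    recall = sum(accepted) / len(predictions)
    print(f"threshold={threshold:.2f}  precision={precision:.2f}  recall={recall:.2f}")
```

As the threshold drops you accept more answers: recall climbs while precision falls, which is exactly the curve described above in miniature.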
How do we handle this? With any tradeoff decision, you as an organization need to determine where your priority lies. How are you using classification? For what purpose? What is the impact of the automation?
In some situations, a bias towards high recall is appropriate. For example, I might be using automated classification to populate navigation options for users as they attempt to find content. The user might not expect perfect documents with each navigation option and as such, it’s worth the trade-off to have more classifications executed.
On the other hand, I might be using classification to determine what content has business value and what content does not. I might be using automation to slice off the content that I'm highly confident doesn't belong in my ECM repository, to reduce downstream costs. In this case, high precision might be warranted. I want to get rid of only the content I'm reasonably certain I can excise.*
*These are two simple examples. Don't take them as gospel advice. Analyze your own situation carefully.
Once you've thought through your policy on making the precision/recall trade-off, there are various "switches" in your automated classification deployment you can use to put a strategy into action. The two biggest ones that come to mind are:
1) Confidence threshold. With advanced methods, each classification response typically comes with some sort of confidence score. If you want high precision, set the confidence level at which you accept responses very high. If you want high recall, set the confidence level at which you will accept categorization suggestions very low.
2) Number of classification responses to accept. A simple way to increase your recall is to expand the number of classification suggestions you incorporate. With advanced methods, you are asking the classification engine to assess similarity to every category upon which it has been trained. Rather than simply accepting the most similar category, you can decide to take the top two suggestions (or three . . . or four). The sketch below shows how these two switches interact.
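Here is a minimal sketch of the two switches working together; the categories, scores and helper function are invented for illustration, not drawn from any product API:

```python
# A minimal sketch of the two "switches" described above; the category
# names and confidence scores are invented, not output from a real product.
def accept_suggestions(ranked, threshold=0.75, top_k=1):
    """ranked: list of (category, confidence) pairs, best first."""
    return [(cat, conf) for cat, conf in ranked[:top_k] if conf >= threshold]

ranked = [("Contracts", 0.81), ("Invoices", 0.78), ("General", 0.22)]

print(accept_suggestions(ranked))                          # high precision: top answer only
print(accept_suggestions(ranked, threshold=0.5, top_k=2))  # higher recall: top two answers
```

Raising the threshold or shrinking top_k pushes you toward the precision end of the curve; loosening either pushes you toward recall.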
For example, a recent IBM customer assessed the applicability of automated classification for automatic assignment of the records management disposition policy for each email they archive. The customer realized that certain types of emails were being inaccurately classified due to how they trained the system. Their examples for two categories were very similar and overlapping. When they dug in, they realized that the training set had given mixed guidance to the classification engine -- it was an instance in which the business stakeholders themselves (the records managers) had frequently clashed on how to properly classify a certain set of content. To resolve the situation (before the advent of automated classification), they had made the policy decision to apply both classifications to the content. As a result, they decided to carry forward the same business policy to their automated classification. Their policy decision implicitly impacted their recall.
Some readers might blanch at the thought of applying this kind of practice across the board. It might not be right to create universal rules to control precision and recall. But you can exercise these two controls in more focused ways -- building a set of rules (based on your business policies) to control precision and recall for specific categories. For example, you might configure an extra rule in your system to accept the second suggestion if and only if it's a particular category.
It's important to define your business policies, as they will impact your technical decisions around precision and recall. If you're assessing and investigating automated classification products and technologies for use in your scenario, look into the tools that the vendor provides to help make this trade-off decision. InfoSphere Classification Module has a set of reports and workflows for making informed precision/recall trade-offs. Expect the same from your own classification tools, because this is a critical set of technical and business decisions that will impact the success of your classification project.
And if you have your favorite precision/recall explanation – share it in the comments. I’m sure there are better ones out there.
Another year, another release! I am happy to announce that our advanced content classification product will issue its next release later this week. Version 8.7 comes out on August 20, to be exact. The formal announcement letter that IBM makes me do is no fun – and press releases are so pre-web 2.0 – so let's run through some of the new changes, features and improvements we've lined up for the product.
The first change of note is the name. We've added the InfoSphere brand to the name, so the product is now known as IBM InfoSphere Classification Module. The InfoSphere brand is all about trusted information. Content analysis and organization is a key element of enhancing content so it becomes trusted business information.
Of course we've made a lot of improvements to the software beyond simply changing the name. The overall emphasis of this release was to continue our focus on improving the Classification Module's ability to provide advanced classification for our Compliant Information Management customers.
A quick bulleted list of some of the features we've added:
- IBM InfoSphere Content Collector integration improvements
- Decision Plan enhancements, including standard regular expression support and extensibility via an open API
- Expanded language support, including a generic language processing option
- A more usable, tabbed Workbench administration tool
- New sample projects and tutorials
Let’s dive into some of the details.
IBM InfoSphere Content Collector Integration Improvements
Significant improvements have been delivered to the core product to facilitate the use of the Decision Plan functionality by IBM Content Collector. Introduced in V8.6 (released in 2008), Decision Plans combine rule-based methods with the product's context-based classification analysis.
V8.7 provides functionality that will allow future versions of IBM Content Collector to efficiently utilize Decision Plans. With Decision Plans, Content Collector users of the Classification Module will have access to a full range of classification analysis methods, improving classification accuracy in content collection and archival scenarios.
Decision Plan improvements
In V8.7, Decision Plans themselves have been enhanced as well.
First, the pattern matching functionality has been enhanced. In prior versions, Classification Module customers have been able to define patterns for identification and subsequent extraction from long-form text. This pattern extraction capability is one of the rules-based methods of classification provided by Decision Plans, used to identify and extract patterns such as account identifiers and customer identifiers.
In V8.7, users of the Classification Workbench can define these pattern-matching and extraction rules using the full, standard regular expression syntax with which IT users are already familiar.
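As an illustration of the kind of rule this enables, here is a standard regular expression that extracts an invented account-identifier format; the pattern itself is made up for this example, not taken from the product documentation:

```python
# A standard regular expression extracting an invented account-ID format.
import re

ACCOUNT_ID = re.compile(r"\bACCT-\d{4}-\d{6}\b")  # e.g. ACCT-2009-123456 (hypothetical)

text = "Regarding account ACCT-2009-123456, please update the billing address."
print(ACCOUNT_ID.findall(text))  # ['ACCT-2009-123456']
```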
The Decision Plan functionality has been modified such that it is now extensible. Do you require a custom method of classification analysis in addition to those provided by the InfoSphere Classification Module? Or do you want the InfoSphere Classification Module to validate its assignments with an outside, trusted source of information? The open API now included with the InfoSphere Classification Module allows for such customizations by IBM customers and Business Partners. The Decision Plan functionality now provides call-outs to allow for custom programs to analyze the filtered documents and any categories already defined by the Classification Module.
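To illustrate the extension pattern conceptually, here is a purely hypothetical call-out sketch; the class names and method signatures are my invention and are not the actual InfoSphere Classification Module API:

```python
# Purely hypothetical sketch of a classification call-out; NOT the actual
# InfoSphere Classification Module API, just the extension pattern in miniature.
class CustomAnalyzer:
    def analyze(self, document_text, suggested_categories):
        """Validate or refine the engine's suggestions against an outside source."""
        raise NotImplementedError

class CrmValidatedAnalyzer(CustomAnalyzer):
    """Keeps a category only if a trusted external source can corroborate it."""
    def __init__(self, known_customers):
        self.known_customers = known_customers  # stand-in for a CRM lookup

    def analyze(self, document_text, suggested_categories):
        if "Customer Correspondence" in suggested_categories and not any(
                name in document_text for name in self.known_customers):
            suggested_categories = [c for c in suggested_categories
                                    if c != "Customer Correspondence"]
        return suggested_categories

analyzer = CrmValidatedAnalyzer(known_customers={"Acme Corp", "Globex"})
print(analyzer.analyze("Follow-up letter to Acme Corp re: claim 4417.",
                       ["Customer Correspondence", "Claims"]))
```

The real product's call-out mechanism will differ; the point is that a custom program receives the filtered document and the engine's suggested categories, and can confirm or veto them.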
In V8.7 of the InfoSphere Classification Module, a number of languages are supported natively. If you wish to use the product with other languages, a generic language processing option is provided.
Workbench. The Workbench administration tool now has improved usability, providing a more intuitive navigation experience via a tabbed-screen paradigm. New users will find it easier to execute tasks quickly. I can personally attest to the improved usability: I've been using it with great regularity over the past two months and have little to complain about.
Sample projects and tutorials. New sample projects have been added to help users begin classifying content more rapidly. With the new tutorials, you can explore the capabilities of the product on your own, at your own pace.
Five random observations regarding the new Cohasset Associates whitepaper, "Meet the Content Tsunami Head On: Leveraging Classification for Compliant Information Management".
1) All due respect to John Mancini and the "Digital Landfill", I like the tsunami metaphor Cohasset uses. Plus it makes for a nicer theme to weave throughout the paper (as compared to a theme based on the garbage of a landfill).
2) The paper describes records management as representing the "organizational memory" for enterprises. As someone who was new to records management 2+ years ago, that nicely turned phrase resonated with me. It makes for a tidy justification, in a few words.
3) The abstract mentions, when discussing automated classification, the need for "best practices and technologies that scale and adapt to meet the information governance challenges at hand." The need for "scalable" policies is a succinct way of justifying automated classification. Another tidy justification.
4) On the topic of scaling, "legacy practices from the paper world simply do not scale" is another idea that reminds me of some recent customer interactions (and discussions with my IBM colleagues). Too often customers are saying that yes indeed, they have a file plan, but it just hasn't been carried over to their electronic records management needs. Maturation of record-keeping and archival practices just hasn't kept pace with the rate of innovation when it comes to creating electronic content.
5) I like a bunch of the stats that Cohasset was able to pull together, like the cost of classifying departed employees' content . . . or the comparison of human classification to automated classification.
I don't do full justice to the paper. Check out this classification whitepaper today if you're interested in the topic.
My parents came to dinner last night. Wracked with guilt over not having posted to this blog in close to a month (sorry folks, vacation and then a crazy post-vacation deadline for a new project), I was trying to brainstorm new ideas for the blog when I recalled a memorable (at least for me) vignette involving my dad.
I was probably 17 years old and had been dispatched to pick up my dad at his office. He had been off on a long business trip and had just returned directly from the airport to his office without a car. He needed something from his office. As a newly minted driver, I went to pick him up, but he had yet to find whatever it was he needed when I got there. I sat in his office as he rifled through his desk looking for the critical paper or object or whatever it was (my father isn't the most organized person; his office was a mess, similar to the desk at which I'm writing this). While he rummaged about, he was listening to the voicemails that had built up while he was gone. Corporate voicemail systems were a relatively new development. It was like the voicemails were background music to him. He was barely paying attention to them. Finally, I piped up.
"Dad, aren't you going to write these messages down?"
He looked up from his search and said, "Nah, if any of these are really important, they'll call me back."
It was a lesson that informs me to this day. I don't get too upset over the horrible state of my inbox after a vacation or time away from diligent inbox management; the important stuff pops back up to the top through the persistence of the truly motivated colleague.
The same dynamic is at play today with our archiving obligations around email and all the other manners of communication and collaboration. The amount of information being pushed on each of us mushrooms every year. And our ability and willingness to process it, let alone fulfill compliance obligations around it, cannot keep up.
My dad could barely find the time to listen to his voicemails, let alone follow up on them, close to 20 years ago. Now you want him to carefully file each email with business value? Without any automated help? It's a laughable proposition (if you know my dad). But he's not so different from all of us. Many employees are going to make the same kind of trade-off decision. The incentive just isn't there for them to handle classification and organization of their information on their own. They need help.
John Mancini, the president of AIIM, keeps a blog called "Digital Landfill." For the last week or two he's opened up his blog to guest authors, each following the format "8 Things You Need to Know . . ."
Today, I took a guest turn, writing "8 Things You Need to Know about Content Classification and ECM." Check it out.
I took great interest in the announcement by Google of their impending "Wave" product. Certainly check out the demonstrations or the associated analysis if you have the time (both pro and less so). Some very flashy technology and a new take on collaboration. Good stuff.
The thing that I kept thinking about as I watched the demos is "how does email fit into this paradigm?" The team at Google seems to position their "Wave" product as a replacement for email and instant messaging, but there's no way an enterprise, even if it embraces Google's new communication method, can get rid of email. Even small organizations would need to maintain email as a communications method. This isn't a replacement in the short term -- it's a complementary piece.
It just further backs up my point that we're adding more and more communication and collaboration methods (Google Wave might soon be added to my standard list of blogs, wikis, SMS, twitter). And with new, more efficient methods of communication comes the proliferation of more original information that will have a life cycle of its own.
Increasingly, organizations expect to leverage and re-use all of this original information. The markets for applications such as search, archival, records management and eDiscovery continue to grow. And as the need to re-use this information throughout a document's life cycle grows, so does the need to make more decisions about that content. These decisions require an analysis of the long-form text that makes up that content.
More information + more decisions throughout documents' life cycles = a mushrooming of possible decisions that need to be made on that information. A content decision-making scaling problem is emerging.
And the scale-out of decision making can only be handled by an automated method. In the face of this convergence of factors, treating your employees like Amazon's Mechanical Turk just isn't viable. Advanced content classification is the means for executing those decisions at scale.
Quick note: I'll be speaking on the topic of classification at the Boston Regional Usernet meeting this Thursday in Framingham, MA.
For more information: https://www-950.ibm.com/events/wwe/ecm/ecmruns09.nsf
Let me know if you'll be there.
I recently read Malcolm Gladwell's article in the New Yorker. On the surface, it's about choosing strategies when you're at a strategic disadvantage. The article studied these "David and Goliath" problems mainly from the perspective of basketball strategy, so Gladwell had the sports fan in me hooked from the outset. Not only that, it touched on topics of organizational behavior, software and war games. I ate it up -- and it gave me at least two, if not more, ideas for blog postings.
One of the lines that struck me came from Gladwell's discussion of the success a computer scientist had in entering war game contests in the early '80s. The scientist's program, dubbed Eurisko, developed unique (and, in the real world, morally dubious) strategies for competing in simulated naval battles. Eurisko purposely sank its own ships when damaged, for example, in response to a new rule in the "game". As quoted by Gladwell, "Eurisko was exposing the fact that any finite set of rules is going to be a very incomplete approximation of reality."
I loved this quote when I read it. It's a very similar phenomenon in the content classification world. There's a definite place for organizing and classifying your information based on rules. But no matter how many rules you build by hand, it's still going to be an "incomplete approximation of reality." You'll never get it totally right with rules. And you'll need to constantly update them to ensure they're keeping up with the changing environment (a challenge with which the war-gamers did not have to grapple).
That is why, when automating your content classification, an approach like the one taken by the Classification Module is important. By learning from examples and training on the full breadth of the language used in sample documents, the system simulates an expert's analysis far more richly than any rule-based approach devised and maintained by humans. And more consistently. And more cost-effectively.
A learn-by-example approach, in layman's terms, creates a far more comprehensive set of rules, one that does a far better job of approximating reality. And by automating these classification tasks, isn't that what you would prefer?
Oh, and if you saw the headline and were expecting some Matthew Broderick movie nostalgia, try this clip.
(note the "self-learning" reference . . . not quite what Classification Module does but the writers definitely had done their computer science homework)
A short, belated post to let folks know that there is a classification session at each location of the IBM ECM Regional UserNet series this spring. I'll be delivering the talk myself in Atlanta next week (May 7-8) and in Boston (May 28).
Last Friday, I went to the Red Sox game with my buddy Dan. It is fun to watch sporting events with Dan. He views things in a glass-half-empty kind of way when it comes to our sporting heroes. "Ortiz can't catch up to the fastball anymore." "Lopez can't throw strikes." It goes on and on. (Don't get him started on the Mo Vaughn era in Red Sox history -- you'll get him going for hours.) He sees their flaws and isn't afraid to bemoan the imperfections.
I'm a bit more of an optimist when it comes to my escapist, sports watching fun -- so I have fun playing the foil to Dan's criticisms. "Ortiz can still have a big year." "Lopez, when used in the right situations, is an effective pitcher." "The season is still early!" It becomes a light-hearted debate.
The same dynamic arises when I talk to customers about automatic classification.
There's always a "Dan" in the audience asking about the accuracy of automatic classification. "What's the accuracy of your system?" they ask. Any response short of 100% leaves them unsatisfied. They say things like "if it's not 100% accurate, we can't rely upon it." If the new method isn't perfect, then they don't want to adopt it. The wall goes up. "We'll rely on our users."
Sure, automated classification isn't 100% accurate -- but neither are the humans it's replacing. In fact, humans are far from perfect (a case I've made before on this blog) and, even worse, inconsistent in their logic and participation. The voices demanding that auto-classification be perfect are holding it to a standard that the current method (complete, unaided reliance on humans) doesn't meet. With automated classification, you'll get more accuracy, more consistency and, most importantly, more cost savings.
Last week, I reviewed the results of the poll from my KMWorld webinar. I discussed the emphasis organizations place on employees to analyze, classify and act on their email and documents for archival. About half of the respondents to the poll said they rely completely on end users for executing archiving. I argue that this is a poor, costly use of employees' time. Employees should be focused on their primary, line-of-business responsibilities. Not only is it costly and unproductive for the employees, they're not likely to participate, or to participate with consistency. The end result is exactly what you don't want: an inconsistently constituted archive. Gartner agrees. Automatic classification can address these issues.
But that's only 50% of organizations that archive. Let's flip this argument on its ear and ask "what about the other 50%?" Well, for those, very often they respond to the issues I've highlighted above by rightfully keeping their users out of the loop. That burden is too high, and only growing worse. So instead, they respond by saving everything in their archive.
Expecting users to classify, without help, their email is an extreme response to compliance issues. The flip side, saving everything without an ounce of organization or filtering, is just as extreme. Archives end up containing loads of information with very little value to the business. Every email between me and my wife coordinating dinner. Every confirmation email from amazon.com. Every job solicitation email. Every organization wide announcement email.
Records Management specialists estimate somewhere between 5 and 15% of emails are actually business records -- a nice proxy for determining whether a piece of information has value to your business. By saving everything, you're saving a whole lot of 'stuff' without any business value. But why is this a bad thing?
1) "Disk is cheap", but its not free. My rule of thumb is that an average email is 100 KB. Average users get anywhere from 25 - 75 emails a day, depending on the researcher. 225 working days a year, at about $4.50/GB . . . the costs for disk storage for an archive get into the millions, annually, very quickly, for mid-to-large scale organizations. At a 10,000 person organization, that cost is about $3.8 M.
Those costs are growing and only going to get worse, as the creation of new content and information is outpacing the corresponding drop in storage prices, according to Gartner.
2) More information saved means more information clutter later. Why is this important? It gets to why people are saving and archiving email and content centrally in the first place. Most are driven by the need to support legal discovery requests efficiently.
And the more email with limited or no business value you save, the more time a $400/hour lawyer is going to spend analyzing those emails. Smaller archives mean fewer emails for the lawyers, which means lower legal costs.
If you follow a reasonable, consistent archiving process that includes automatic classification, you can cut your hardware costs and keep more of your organization's money from flowing to lawyers. Without annoying your users or relying on their whims as to when, if and how they want to participate.