The Code Hit the Fan: java


Friday, June 10, 2011

Alfresco and a Custom TaskControllerHandler

My work has required me to create an advanced workflow in Alfresco to streamline how users insert documents and apply the corresponding metadata. The outline of what needed to be done is this:

Upload a document
Start a new workflow for that document
Require the user to select a document type from a list of subtypes of the current document type

Under the covers, Alfresco uses jBPM. It actually has been integrated quite nicely, but there are certain features that are not readily available after their marriage. It does not provide a way to display a list of options out of the box, unless it is static.

There are hacks that allow one to do this with a ListConstraint, but a hack isn't always the best way to go (think fitting a square peg in a round hole). A ListConstraint almost fits the need here for selecting a document type, except it's missing one crucial element: access to the node in the workflow (or any node for that matter). In order to get a list of subtypes, I need to check the node's current type. There are ways around this, but they aren't exactly best practice. So this option was a non-starter.

I looked into using actions before and after the task to populate the list and change the document type, but this seemed to be a bunch of extra work and also a bit of a kludge.

Another option was to use BeanShell script or JavaScript directly in the workflow. I can't say I'm a big fan of putting code directly in the XML of the workflow, nor am I a fan of server-side JavaScript (too hard to debug).

Some searching on jBPM revealed that it is possible to map properties to a task from the process context (or whatever source you want) via a TaskControllerHandler. The task controller excels at allowing the programmer to provide data to the task that is not a 1 to 1 mapping of what is in the process context (which is why configuring the default task controller in the workflow was out of the question, as it only does 1 to 1). This seemed to be the option to go with; however, there are several pain points I came across while implementing it. The rest of this blog post will address those points.

The process I outlined of what the controller needed to do:

Retrieve the document node and determine its child types
Present that to the user via the task instance
User is able to select one of the options
Upon submission, the document node is changed to the new document subtype

The first thing I needed to do was obtain references to the Alfresco services. At first I wasn't sure how to do this, since the TaskControllerHandler is managed by jBPM and not Spring, so dependency injection was out. I've done some work using workflow actions in Alfresco, which have access to the services via a BeanFactory. So to the source code I went and figured out how they set that up (javadoc and source for JBPMSpringActionHandler). With this knowledge, I created the following abstract class to extend all my Alfresco task controllers from:

package com.burris.common.bpm.taskcontrollers;

import org.jbpm.taskmgmt.def.TaskControllerHandler;
import org.springframework.beans.factory.BeanFactory;
import org.springframework.beans.factory.access.BeanFactoryLocator;
import org.springframework.beans.factory.access.BeanFactoryReference;
import org.springmodules.workflow.jbpm31.JbpmFactoryLocator;

@SuppressWarnings("serial")
public abstract class AlfrescoTaskControllerHandler implements TaskControllerHandler {

 public AlfrescoTaskControllerHandler() {
  
  BeanFactoryLocator factoryLocator = new JbpmFactoryLocator();
  BeanFactoryReference factoryReference = factoryLocator.useBeanFactory( null );
  BeanFactory factory = factoryReference.getFactory();
  initializeHandler( factory );
 }
 
 protected abstract void initializeHandler( BeanFactory factory );
}

The initializeHandler function is intended to grab and store references to the services required. Now I am ready to create my actual task handler, called DocumentSubtypeTaskController:

package com.burris.common.bpm.taskcontrollers;

import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

import org.alfresco.repo.workflow.jbpm.JBPMNode;
import org.alfresco.service.cmr.dictionary.ClassDefinition;
import org.alfresco.service.cmr.dictionary.DictionaryService;
import org.alfresco.service.cmr.repository.ChildAssociationRef;
import org.alfresco.service.cmr.repository.NodeRef;
import org.alfresco.service.cmr.repository.NodeService;
import org.alfresco.service.namespace.NamespaceService;
import org.alfresco.service.namespace.QName;
import org.apache.log4j.Logger;
import org.jbpm.context.exe.ContextInstance;
import org.jbpm.graph.exe.Token;
import org.jbpm.taskmgmt.exe.TaskInstance;
import org.springframework.beans.factory.BeanFactory;

@SuppressWarnings( "serial" )
public class DocumentSubtypeTaskController extends AlfrescoTaskControllerHandler {

 public DocumentSubtypeTaskController() {}
 
 @Override
 protected void initializeHandler( BeanFactory factory ) {
  // TODO Auto-generated method stub
  
 }
 
 @Override
 public void initializeTaskVariables( TaskInstance taskInstance, ContextInstance contextInstance, Token token ) {
  // TODO Auto-generated method stub
  
 }
 
 @Override
 public void submitTaskVariables( TaskInstance taskInstance, ContextInstance contextInstance, Token token ) {
  // TODO Auto-generated method stub
  
 }
}

I first implemented my initializeHandler method and added the services to the class:

...
 
 private NodeService nodeService;
 private DictionaryService dictionaryService;
 private NamespaceService namespaceService;

 ...

 @Override
 protected void initializeHandler( BeanFactory factory ) {
  
  this.nodeService = (NodeService) factory.getBean( "nodeService" );
  this.dictionaryService = (DictionaryService) factory.getBean( "dictionaryService" );
  this.namespaceService = (NamespaceService) factory.getBean( "namespaceService" );
 }

Next I implemented the two methods from the TaskControllerHandler interface: initializeTaskVariables and submitTaskVariables. The purpose of these methods is to set up the variables of the task and to retrieve their values to process them, respectively. The values can come from just about anywhere, but usually from the process context. In this case, I need to retrieve the type of the document node in the bpm_package, which is in the process context.

...
 
 private static final String DOCUMENT_TYPE = "blndwf_documentType";
 
 ...
 
 @Override
 public void initializeTaskVariables( TaskInstance taskInstance, ContextInstance contextInstance, Token token ) {
  
  // Get the document from the workflow package
  Object object = contextInstance.getVariable( "bpm_package" );
  if ( object == null ) return;
  
  NodeRef bpmPackageNodeRef = ((JBPMNode) object).getNodeRef();
  if ( bpmPackageNodeRef == null ) return;
  
  List<ChildAssociationRef> children = nodeService.getChildAssocs( bpmPackageNodeRef );
  if ( children.isEmpty() ) return;
  ChildAssociationRef childNodeRef = children.get( 0 );
  if ( childNodeRef == null ) return;
  
  NodeRef documentNodeRef = childNodeRef.getChildRef();
  if ( documentNodeRef == null ) return;
  
  // Get the document's subtypes 
  QName documentQName = nodeService.getType( documentNodeRef );
  Collection<QName> subTypes = dictionaryService.getSubTypes( documentQName, false );
  
  List<String> options = new ArrayList<String>();
  for ( QName type : subTypes ) {
   
   ClassDefinition classDef = dictionaryService.getClass( type );
   options.add( type.getPrefixString() + "|" + classDef.getTitle() );
  }
  taskInstance.setVariable( DOCUMENT_TYPE, options );
 }

A couple things you need to note here. The constant DOCUMENT_TYPE value is using an underscore (_). The actual property (which is coming from the workflow task model definition) looks like this: blndwf:documentType. The reason for this is that jBPM does not allow colons (:) in the variable names. I'm not sure which piece does this, but using the underscore automatically maps it to the correct property in the task model.
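To make the naming rule concrete, here is a minimal sketch of the conversion between the model property name and the jBPM variable name. The class and method names are hypothetical (they are not part of Alfresco or jBPM), and the sketch assumes the prefix itself never contains an underscore.

```java
// Hypothetical helper, not an Alfresco or jBPM API: it only illustrates the naming rule.
final class WorkflowVariableNames {

    private WorkflowVariableNames() {}

    // Turns a prefixed model property such as "blndwf:documentType"
    // into the jBPM-safe variable name "blndwf_documentType".
    static String toVariableName(String prefixedProperty) {
        return prefixedProperty.replaceFirst(":", "_");
    }

    // Reverse direction. Assumes the prefix contains no underscore,
    // so the first underscore is the prefix separator.
    static String toPropertyName(String variableName) {
        return variableName.replaceFirst("_", ":");
    }
}
```

Only the first separator is replaced, so a local name that happens to contain an underscore survives the round trip.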

The other thing to note is the for loop. I set up a list of strings containing the type string and the title separated by the pipe symbol (|). At the end of the day this list is converted to a string before it gets to the UI. I have created a custom form control that will split these strings (splits on "," and then "|") to use in a select box. The type gets mapped to the value of an option while the title is what is actually displayed. There was no other easy way to do this without string manipulation in the UI.
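The splitting that the form control performs can be sketched as follows. This is an illustration in Java rather than the actual UI code, and it assumes the list arrives as a plain comma-separated string such as "blnd:invoice|Invoice,blnd:receipt|Receipt"; the class name and the sample type names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of the split logic; the real control lives in the UI layer.
final class DocumentTypeOptions {

    private DocumentTypeOptions() {}

    // Parses "type|title,type|title" into an ordered map of option value -> display label.
    static Map<String, String> parse(String serialized) {
        Map<String, String> options = new LinkedHashMap<String, String>();
        if (serialized == null || serialized.trim().isEmpty()) {
            return options;
        }
        for (String entry : serialized.split(",")) {
            // Split on the first pipe only, so a title may itself contain a pipe.
            String[] parts = entry.trim().split("\\|", 2);
            String value = parts[0].trim();
            // Fall back to the type string when no title was supplied.
            String label = parts.length > 1 ? parts[1].trim() : value;
            options.put(value, label);
        }
        return options;
    }
}
```

One limitation of this encoding is that a title containing a comma would be split into two entries, so the delimiters only work while titles stay free of them.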

Lastly, I completed the submitTaskVariables method:

@Override
 public void submitTaskVariables( TaskInstance taskInstance, ContextInstance contextInstance, Token token ) {
  
  // Get the document from the workflow package
  Object object = contextInstance.getVariable( "bpm_package" );
  if ( object == null ) return;
  
  NodeRef bpmPackageNodeRef = ((JBPMNode) object).getNodeRef();
  if ( bpmPackageNodeRef == null ) return;
  
  List<ChildAssociationRef> children = nodeService.getChildAssocs( bpmPackageNodeRef );
  if ( children.isEmpty() ) return;
  ChildAssociationRef childNodeRef = children.get( 0 );
  if ( childNodeRef == null ) return;
  
  NodeRef documentNodeRef = childNodeRef.getChildRef();
  if ( documentNodeRef == null ) return;
  
  String selectedType = (String) taskInstance.getVariable( DOCUMENT_TYPE );
  if ( log.isDebugEnabled() ) log.debug( selectedType );
  
  // TODO: Validate user input
  
  // Set the document type if valid
  QName newType = QName.createQName( selectedType, namespaceService );
  nodeService.setType( documentNodeRef, newType );
  
  // TODO: Signal only when data is valid
  token.signal();
 }

Once the task variables have been processed (often they are simply mapped back to the context instance, but not in this case, where I actually change the document type), token.signal() needs to be called. This tells jBPM that the task controller is done and the workflow is ready to transition to the next task.
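The two TODO comments in submitTaskVariables leave validation open. One possible approach, sketched here under assumptions, is to rebuild the collection of allowed type strings the same way initializeTaskVariables does and only change the type (and signal) when the submitted value is among them. The class and method names below are hypothetical, and the sample type names are made up.

```java
import java.util.Collection;

// Hypothetical validation helper; not part of Alfresco or jBPM.
final class SelectionValidator {

    private SelectionValidator() {}

    // Returns true only when the submitted value is non-empty and is one of
    // the prefixed type strings that were offered to the user.
    static boolean isValidSelection(String selected, Collection<String> allowedTypes) {
        if (selected == null || allowedTypes == null) {
            return false;
        }
        String trimmed = selected.trim();
        return !trimmed.isEmpty() && allowedTypes.contains(trimmed);
    }
}
```

In the controller, the calls to nodeService.setType and token.signal() would then sit inside an if block guarded by this check.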

That's it for the Java code! This can be jar'd up and put into the lib directory (alfresco.war/WEB-INF/lib). Here is the final class in its entirety:

package com.burris.common.bpm.taskcontrollers;

import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

import org.alfresco.repo.workflow.jbpm.JBPMNode;
import org.alfresco.service.cmr.dictionary.ClassDefinition;
import org.alfresco.service.cmr.dictionary.DictionaryService;
import org.alfresco.service.cmr.repository.ChildAssociationRef;
import org.alfresco.service.cmr.repository.NodeRef;
import org.alfresco.service.cmr.repository.NodeService;
import org.alfresco.service.namespace.NamespaceService;
import org.alfresco.service.namespace.QName;
import org.apache.log4j.Logger;
import org.jbpm.context.exe.ContextInstance;
import org.jbpm.graph.exe.Token;
import org.jbpm.taskmgmt.exe.TaskInstance;
import org.springframework.beans.factory.BeanFactory;

@SuppressWarnings( "serial" )
public class DocumentSubtypeTaskController extends AlfrescoTaskControllerHandler {

 private static final Logger log = Logger.getLogger( DocumentSubtypeTaskController.class );
 private static final String DOCUMENT_TYPE = "blndwf_documentType";
 
 private NodeService nodeService;
 private DictionaryService dictionaryService;
 private NamespaceService namespaceService;
 
 public DocumentSubtypeTaskController() {}
 
 @Override
 protected void initializeHandler( BeanFactory factory ) {
  
  this.nodeService = (NodeService) factory.getBean( "nodeService" );
  this.dictionaryService = (DictionaryService) factory.getBean( "dictionaryService" );
  this.namespaceService = (NamespaceService) factory.getBean( "namespaceService" );
 }
 
 @Override
 public void initializeTaskVariables( TaskInstance taskInstance, ContextInstance contextInstance, Token token ) {
  
  // Get the document from the workflow package
  Object object = contextInstance.getVariable( "bpm_package" );
  if ( object == null ) return;
  
  NodeRef bpmPackageNodeRef = ((JBPMNode) object).getNodeRef();
  if ( bpmPackageNodeRef == null ) return;
  
  List<ChildAssociationRef> children = nodeService.getChildAssocs( bpmPackageNodeRef );
  if ( children.isEmpty() ) return;
  ChildAssociationRef childNodeRef = children.get( 0 );
  if ( childNodeRef == null ) return;
  
  NodeRef documentNodeRef = childNodeRef.getChildRef();
  if ( documentNodeRef == null ) return;
  
  // Get the document's subtypes 
  QName documentQName = nodeService.getType( documentNodeRef );
  Collection<QName> subTypes = dictionaryService.getSubTypes( documentQName, false );
  
  List<String> options = new ArrayList<String>();
  for ( QName type : subTypes ) {
   
   ClassDefinition classDef = dictionaryService.getClass( type );
   options.add( type.getPrefixString() + "|" + classDef.getTitle() );
  }
  taskInstance.setVariable( DOCUMENT_TYPE, options );
 }

 @Override
 public void submitTaskVariables( TaskInstance taskInstance, ContextInstance contextInstance, Token token ) {
  
  // Get the document from the workflow package
  Object object = contextInstance.getVariable( "bpm_package" );
  if ( object == null ) return;
  
  NodeRef bpmPackageNodeRef = ((JBPMNode) object).getNodeRef();
  if ( bpmPackageNodeRef == null ) {
   
   log.fatal( "nodeRef is null." );
   return;
  }
  
  List<ChildAssociationRef> children = nodeService.getChildAssocs( bpmPackageNodeRef );
  if ( children.isEmpty() ) return;
  ChildAssociationRef childNodeRef = children.get( 0 );
  if ( childNodeRef == null ) return;
  
  NodeRef documentNodeRef = childNodeRef.getChildRef();
  if ( documentNodeRef == null ) return;
  
  String selectedType = (String) taskInstance.getVariable( DOCUMENT_TYPE );
  if ( log.isDebugEnabled() ) log.debug( selectedType );
  
  // Validate user input
  
  // Set the document type if valid
  QName newType = QName.createQName( selectedType, namespaceService );
  nodeService.setType( documentNodeRef, newType );
  
  // Signal only when data is valid?
  token.signal();
 }
}

Saturday, January 30, 2010

Concurrent Programming Concepts

Introduction
Writing sequential programs, while a lot easier than concurrent programming, is a thing of the past. Sequential programming was effective while Moore's Law translated directly into faster clock speeds; however, now that raw processing speed has hit a brick wall, we have had to look at executing code in parallel. With more and more systems being built with multiple processors, each with multiple cores, not taking advantage of them is poor programming: it limits the scalability, responsiveness, and performance of any program. To make things worse, many who code in Java still think sequentially even though threads permeate the Java platform. Servlets run on threads and can be accessed concurrently by multiple user requests, GUI frameworks use threads underneath to stay responsive, the garbage collector runs in separate threads of its own, and so on. They are everywhere. If a program is written without concurrency in mind, you run a great risk of inadvertently creating errant code, because even if you don't use multiple threads yourself, that doesn't mean there aren't any there to access it. If multiple threads access the same mutable (changeable) variables, and that access isn't properly synchronized, your program is broken. End of story. It may look like your code works initially, and it may work correctly for years to come, but it all amounts to lucky or unlucky timing. The worst time to find out that the code is broken is when it fails in production under heavy load (and no doubt that is when it will happen, because the system has so much going on).

State
Concurrent programming is not defined by using threads and locks; rather, it is all about maintaining state. Threads and locks are the building blocks that we use to maintain state with concurrent access. An object's state is encompassed by the instance variables, static variables, or other dependent objects. When we talk about thread-safety, it may sound like we are talking about code, but what we are really saying is that we are trying to protect data by controlling concurrent access to the data.

Ways to Manage It
There are three primary ways in Java to maintain a class's state:

Don't share state across threads (encapsulation).

Make state variables immutable (unchangeable).

Use synchronization techniques whenever accessing state variables.

It is important to encapsulate synchronization inside an object so that anything using it doesn't have to implement its own synchronization on the object's behalf. Stateless objects are always thread-safe, because there is nothing they have to remember or keep track of during execution. If a single element of state is added to a stateless class, the class remains thread-safe as long as that state is fully managed by a thread-safe object. When multiple variables comprise an object's state, it is important to know which ones depend on each other and which ones don't. If two or more related variables are accessed by more than one thread, they can get out of sync (meaning the data contained within them doesn't match up) unless they are always read and updated together under the same lock. If the variables are independent of each other, the synchronization may be exclusive to each variable. This means that each one can be read, modified, or written to without worrying about what the others hold.
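As a minimal sketch of the dependent-variables case, consider a range whose lower bound must never exceed its upper bound. The class below is a hypothetical example (not taken from any library); it guards both fields with the same intrinsic lock so that no thread can ever observe or create an inconsistent pair.

```java
// Hypothetical example: two dependent state variables guarded by one lock.
final class NumberRange {

    // Invariant: lower <= upper. Both fields are guarded by "this".
    private int lower = 0;
    private int upper = 0;

    // The check and the write happen under the same lock, so another thread
    // cannot change "upper" between them.
    synchronized void setLower(int value) {
        if (value > upper) {
            throw new IllegalArgumentException("lower cannot exceed upper");
        }
        lower = value;
    }

    synchronized void setUpper(int value) {
        if (value < lower) {
            throw new IllegalArgumentException("upper cannot be below lower");
        }
        upper = value;
    }

    // Reads both fields atomically with respect to the setters.
    synchronized boolean contains(int value) {
        return value >= lower && value <= upper;
    }
}
```

Making each field individually thread-safe (for example with two separate atomic variables) would not be enough here, because the invariant spans both fields and the check-then-set must be a single atomic step.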

Final Thought
Concurrent programming is difficult at first, but it is also not going away... ever. The only way to get better at it is to read about it, read it again, and do it until my head and fingers hurt. Repetition is the mother of knowledge. That is the only way this will get any easier.

Thursday, January 28, 2010

Interrupting Thread Execution

In one of my programming classes, we had to write a multi-threaded chat server that could accept an arbitrary number of clients and transmit messages from one to another. We knew little about concurrency at the time, so we didn't know how to stop a thread from executing. Because we were young and aspiring students that were motivated to get a project done by a certain deadline, we went where we always go when looking for an answer... Google (and the professor)! This is basically what we found from many different sites, forums, and blog posts (I think my professor agreed/told us to do it this way too):

http://www.roseindia.net/javatutorials/shutting_down_threads_cleanly.shtml

While this will work, it is not the proper or best way to stop a thread, and it can cause problems with responsiveness, liveness, deadlocks, etc., as we will see in later posts. So here is the general idea of the bad example:

package org.tiki.threadexamples;


public class BadThreadStopExample extends Thread {

 private boolean continueRunning = true;

 public void run() {
 
  while ( continueRunning ) {
   
   // Do stuff
   System.out.println( "Bad thread is doing its thing." );
  }
 }
 
 public void cancel() {
  
  continueRunning = false;
 }
 
 public static void main( String[] args ) {
  
  BadThreadStopExample thread = new BadThreadStopExample();
  thread.start();
  
  // Do stuff
  
  thread.cancel();
 }
}

So what is the correct way of handling these things? I'm glad you asked. I've recently been reading the book Java Concurrency in Practice by Brian Goetz and I highly recommend it. After reading a few chapters, it boggles my mind as to why there are so many examples on the internet that follow the above paradigm. I am guilty of writing similar code in most of my multithreaded programs, so if you've done this, you're not alone. I have been enlightened on how this should be written (thank you Brian Goetz)... the thread interruption methods!

public void interrupt()

public boolean isInterrupted()

public static boolean interrupted()

Using these methods, we can change the above code into this:

package org.tiki.threadexamples;


public class GoodThreadStopExample extends Thread {

 public void run() {
  
  while ( !Thread.currentThread().isInterrupted() ) {
   
   // Do stuff
   System.out.println( "Good thread is doing its thing." );
  }
 }
 
 public void cancel() {
  
  interrupt();
 }
 
 public static void main( String[] args ) {
  
  GoodThreadStopExample thread = new GoodThreadStopExample();
  thread.start();
  
  // Do stuff
  
  thread.cancel();
 }
}

So this code now sends an interrupt message to the thread, and the while loop now checks that message instead of defining your own variable. However, if blocking functions are in the loop, such as serverSocket.accept() or in.readLine(), the thread may not behave as you think it would. Interrupt methods don't always cause blocking functions to stop what they are doing. How to handle these cases is a bit different and what you do to handle them can depend on the situation and the requirements of the program... So that is for another post.
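For the subset of blocking calls that do respond to interruption, such as BlockingQueue.take() or Thread.sleep(), the usual pattern can be sketched as below. This is a hedged illustration with a hypothetical worker class; it does not cover the socket and stream calls mentioned above, which ignore interrupts and need a different approach.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical worker: blocks on a queue and treats interruption as a request to stop.
final class InterruptibleWorker extends Thread {

    private final BlockingQueue<String> messages = new LinkedBlockingQueue<String>();
    private volatile boolean finished = false;

    void submit(String message) {
        messages.offer(message);
    }

    boolean hasFinished() {
        return finished;
    }

    @Override
    public void run() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                // take() blocks until a message arrives and throws
                // InterruptedException if the thread is interrupted.
                String message = messages.take();
                System.out.println("Received: " + message);
            }
        } catch (InterruptedException e) {
            // The exception clears the interrupt flag; restore it so that
            // code further up the call stack can still see the request.
            Thread.currentThread().interrupt();
        } finally {
            finished = true;
        }
    }
}
```

Calling interrupt() on a started InterruptibleWorker ends the loop whether the thread is busy printing or parked inside take(), which is exactly what the hand-rolled boolean flag cannot do.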