
Some undergrad projects in bridging efficiency & productivity layers

Relative to our current work on SEJITS, autotuning, and “frictionless high performance software”:

  • Start an autotuning DB for use by SEJITS as well as manual use. Challenge is to determine a schema for this info that could be used both for human queries and machine queries (eg via XMLRPC). Each time an autotuning parameter set is determined, add it to the DB.
  • Use Archana’s and Kristal’s KCCA algorithms as a test case for “frictionless”. They are sparse-matrix eigenvalue solver problems.
  • SEJITS: take Andrew Ng et al.’s paper on mapping a variety of statistical machine learning (SML) algorithms to “summation form” for GPU execution, and apply SEJITS to those computations.
  • SEJITS: look at LAWN 223 (Cholesky factorization on GPU) and encapsulate it in a specializer.
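
To make the autotuning DB idea concrete, here is a minimal sketch of one possible schema using Python's built-in sqlite3 module. Everything in it is an assumption for illustration: the table name `tuning_result`, the column names, the choice of GFLOP/s as the performance metric, and the helper names `open_db`, `record` and `best_params`. It is not an agreed design, only a starting point for discussing what humans and machines would both want to query.

```python
import json
import sqlite3

# Hypothetical schema: one row per tuning run of a kernel on a platform.
SCHEMA = """
CREATE TABLE IF NOT EXISTS tuning_result (
    id           INTEGER PRIMARY KEY,
    kernel       TEXT NOT NULL,   -- e.g. 'spmv'
    platform     TEXT NOT NULL,   -- machine or device descriptor
    problem_size TEXT NOT NULL,   -- free-form descriptor, e.g. 'n=4096'
    params_json  TEXT NOT NULL,   -- tuned parameter set, JSON-encoded
    gflops       REAL NOT NULL,   -- measured performance for this set
    recorded_at  TEXT DEFAULT CURRENT_TIMESTAMP
)
"""


def open_db(path=":memory:"):
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def record(conn, kernel, platform, problem_size, params, gflops):
    """Add one autotuning result; called each time a parameter set is determined."""
    conn.execute(
        "INSERT INTO tuning_result "
        "(kernel, platform, problem_size, params_json, gflops) "
        "VALUES (?, ?, ?, ?, ?)",
        (kernel, platform, problem_size, json.dumps(params, sort_keys=True), gflops),
    )
    conn.commit()


def best_params(conn, kernel, platform, problem_size):
    """Machine-style query: best known parameter set, or None if never tuned."""
    row = conn.execute(
        "SELECT params_json FROM tuning_result "
        "WHERE kernel = ? AND platform = ? AND problem_size = ? "
        "ORDER BY gflops DESC LIMIT 1",
        (kernel, platform, problem_size),
    ).fetchone()
    return json.loads(row[0]) if row else None
```

Humans could query the same table with plain SQL. For machine queries over XMLRPC, a function like `best_params` could be registered with Python's `xmlrpc.server.SimpleXMLRPCServer`, though that wiring is left out here.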

STM panel at MS Research Fac Summit 07

There’s no consensus currently in language community on how to
“approach” STM: libraries only? extend existing lang? new lang? use
it to improve implementation of existing abstractions in existing
langs, or need new lang to express abstractions uniquely matched to the
use of STM?

TRANSACT workshops at PODC, PPoPP may be worth keeping an eye on from
PARlab perspective (www.cs.rochester.edu/meetings/TRANSACT07).
TRANSACT08 will be colocated with HPCA-14/PPoPP in Salt Lake City.

Christos: features like iterators, retry, abstract locks, etc being
exposed to HLLs and implemented using STM. (Analogy: database
“programmers” think about atomicity but program in SQL)
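
As a rough illustration of what a construct like “retry” looks like from the HLL side, here is a small Python sketch. Note the substitution: it is implemented with one global lock and a condition variable, not a real STM, and the names `atomic`, `Retry`, `take` and `put` are made up for this example. The point is only that the programmer writes “if I can't proceed, retry” and never touches the mechanism underneath.

```python
import threading

# One global condition variable: a degenerate stand-in for a real STM runtime.
_cond = threading.Condition()


class Retry(Exception):
    """Raised by a block to say: I cannot proceed yet, rerun me after a change."""


def atomic(body):
    """Run body() in isolation; on Retry, sleep until another block commits.

    Assumption: body raises Retry *before* mutating anything, because this
    toy version has no rollback.
    """
    with _cond:
        while True:
            try:
                result = body()
            except Retry:
                _cond.wait()       # releases the lock while sleeping
                continue
            _cond.notify_all()     # a commit happened: wake any retriers
            return result


queue = []


def take():
    def body():
        if not queue:
            raise Retry()          # nothing to take yet
        return queue.pop(0)
    return atomic(body)


def put(item):
    atomic(lambda: queue.append(item))
```

A consumer thread calling `take()` on an empty queue simply blocks until some other thread calls `put(...)`; neither side mentions locks or condition variables.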

Implicit hypothesis: STM programming langs/models will be declarative,
more like SQL than like Java

I/O within STM: better idea might be system level txns, coordinate txns
across multiple resources. (Quicksilver did this but was hairy)

Christos: not enough exp. yet with STM to have tried to solve all the
parallelism problems it could potentially solve.

Good news: STM useful for more than managing concurrency. eg: debugging
(deterministic replay), profiling/tuning (contention/locality
monitoring), security (isolation), fault tolerance…maybe these will be
“killer apps” for STM? (=> maybe, but doesn’t nec. help manycore research)

Maurice Herlihy: interested in integrating STM with *managed* languages
(implicit memory mgt, bounds checking, etc) => discuss synchronization
at object level, not addresses or other low-level structures

Declare a class or obj as ‘atomic’: compiler (Phoenix) does whole-object
dataflow analysis to figure out where STM can be used to enforce
atomicity.

“Libraries either low-overhead but hard to use, or inefficient but easy
to use; compiler support combines best of both worlds including the
opportunity for nonlocal optimization”

Mike Scott, U Rochester: RSTM library (result of 3+ yrs doing TM
libraries for Java, C…) + Delaunay mesh app

“Smart pointer” API (callback at initialize & dereference; design
pattern from C++). Needed because you want to be able to distinguish
between initial & subsequent accesses (in an STM environment).

Can also check for conditions like trying to access a txnl object
outside of txn code.
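
The two points above (distinguishing first from later accesses, and catching access outside a txn) can be sketched in a few lines of Python. This is not RSTM's API; `Transaction`, `TxHandle` and `deref` are hypothetical names, the code is single-threaded, and the "callback" is reduced to appending a string to a log.

```python
class Transaction:
    """Toy transaction context. Not thread-safe; for illustration only."""
    current = None  # the active transaction, if any

    def __init__(self):
        self.opened = {}  # id(handle) -> (handle, private working copy)
        self.log = []     # records which kind of access happened

    def __enter__(self):
        Transaction.current = self
        return self

    def __exit__(self, exc_type, exc, tb):
        Transaction.current = None
        if exc_type is None:
            # Commit: publish every working copy.
            for handle, copy in self.opened.values():
                handle._committed = copy
        # On an exception the working copies are simply dropped (abort).
        return False


class TxHandle:
    """Smart-pointer-like wrapper around a dict of fields."""

    def __init__(self, fields):
        self._committed = dict(fields)

    def deref(self):
        txn = Transaction.current
        if txn is None:
            raise RuntimeError("transactional object accessed outside a txn")
        if id(self) not in txn.opened:
            # Initial access: the place to snapshot/validate the object.
            txn.log.append("first-access")
            txn.opened[id(self)] = (self, dict(self._committed))
        else:
            # Subsequent access: reuse the working copy.
            txn.log.append("repeat-access")
        return txn.opened[id(self)][1]


acct = TxHandle({"balance": 10})
with Transaction() as txn:
    acct.deref()["balance"] += 5
    acct.deref()["balance"] += 1
```

Every access has to go through `deref()`, which is exactly the kind of ceremony the panel complained about.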

“Programming model provided by library doomed to be too cumbersome for
naive users…compiler support will be essential for
semantics/programmer understandability as well as performance”

“Awkwardnesses” such as having to define accessors (in C++; C# avoids
this, as does Ruby)

Can’t share code templates between txnl and non-txnl code, because txnl
types are explicitly labeled. (Ruby “duck typing” might solve this!)
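
A small sketch of the duck-typing idea, using Python as a stand-in for Ruby. The classes `Account` and `TxAccount` and the function `deposit` are invented for illustration, and the "transactional" class just buffers a write until `commit()`; it is not a real STM. The point is that `deposit` is written once and works on both kinds of object because it only relies on a `.balance` attribute.

```python
class Account:
    """Plain, non-transactional object."""

    def __init__(self, balance):
        self.balance = balance


class TxAccount:
    """Hypothetical transactional counterpart: writes are buffered until commit()."""

    def __init__(self, balance):
        self._committed = balance
        self._pending = None

    @property
    def balance(self):
        # Reads see the buffered write if there is one.
        return self._committed if self._pending is None else self._pending

    @balance.setter
    def balance(self, value):
        self._pending = value

    def commit(self):
        if self._pending is not None:
            self._committed = self._pending
            self._pending = None

    def abort(self):
        self._pending = None


def deposit(acct, amount):
    # Shared code: only requires a readable and writable .balance attribute.
    acct.balance += amount
```

With statically labeled txnl types, `deposit` would need one version per type; here the same function body serves both.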

Can’t use this-> dereference for typing reasons (have to explicitly pass
a smartpointer to a method….ugh)

These things could be fixed in a compiler – but not in a library.
(except in Ruby, where Modules might serve?)

Deeper challenges that can’t be plastered over by compiler:

What reverts when you abort a txn? In library approach, only the things
labeled as txn are rolled back. Other stuff isn’t. Major gotcha
because it goes against your intuition of how txn abort “should” work.
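
The gotcha is easy to reproduce in a toy setting. In the Python sketch below, `TVar`, `Abort` and `atomically` are hypothetical names for a minimal library-style txn mechanism that snapshots and restores only the values explicitly wrapped as transactional. It is single-threaded and not a real STM.

```python
class TVar:
    """A value explicitly labeled as transactional."""

    def __init__(self, value):
        self.value = value


class Abort(Exception):
    """Raised inside a txn body to abort it."""


def atomically(body, *tvars):
    """Run body(); on Abort, restore only the listed TVars. Returns True on commit."""
    snapshot = [tv.value for tv in tvars]
    try:
        body()
        return True
    except Abort:
        for tv, old in zip(tvars, snapshot):
            tv.value = old
        return False


balance = TVar(100)
audit_log = []  # ordinary list: NOT labeled transactional


def withdraw():
    balance.value -= 150
    audit_log.append("withdrew 150")  # side effect on non-txnl state
    if balance.value < 0:
        raise Abort()


committed = atomically(withdraw, balance)
```

After the abort, `balance.value` is back to its old value but `audit_log` still contains the entry written by the aborted txn, which is the counterintuitive part.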

Class-level txn typing probably will be obsoleted by per-object.

FORTRESS – a new lang design (from Sun?) that emphasizes “high level
control constructs” like MapReduce that are tightly integrated with
transactional support in the language, but also designed as a
“FORTRAN replacement” for manycore scientific programming

http://www.pervasivedatarush.com/node/42

From a critique/summary of FORTRESS:
“I am not convinced Fortress will be a better foundation for DSELs than
a functional language like Scheme or Haskell. In both of these
languages, any code you write seems to be a very natural extension of
the provided base language. Fortress allows arbitrary specification of
its abstract syntax, but in a way that does not feel as natural and
simplistic.”

From another commentary – this echoes my own view:

http://alblue.blogspot.com/2007/06/programming-languages-and-multi-core.html

“Maybe what’s needed is parallelism at a higher level. Instead of
insisting on a functional language through-and-through, it may be
possible to do some parts in a functional style but then break that (in
a controlled way) in smaller segments.”