Monday, October 13, 2014

mysql prepared statements caching

PreparedStatement is cached inside the J2EE server's connection pool manager. The J2EE server keeps a list of prepared statements for each database connection in the pool. When an application calls prepareStatement on a connection, the application server checks whether that statement was previously prepared. If it was, the PreparedStatement object will be in the cache and is returned to the application. If not, the call is passed to the JDBC driver and the resulting PreparedStatement object is added to that connection's cache.

We need a cache per connection because that's the way JDBC drivers work: any PreparedStatement returned is specific to that connection.

If we want to take advantage of this cache, we need to use parameterized queries so that they match statements already prepared in the cache. Most application servers allow you to tune the size of this prepared statement cache.
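
A minimal JDBC-flavored sketch of the pattern, assuming a pooled DataSource supplied by the application server (the class name is illustrative; the SQL reuses the Country example further below). With MySQL Connector/J, client-side statement caching can also be switched on via connection properties such as cachePrepStmts and prepStmtCacheSize:

  import java.sql.Connection;
  import java.sql.PreparedStatement;
  import java.sql.ResultSet;
  import javax.sql.DataSource;

  public class CountryDao {
      private final DataSource pool; // the app server's pooled DataSource (illustrative)

      public CountryDao(DataSource pool) {
          this.pool = pool;
      }

      public String findName(String code) throws Exception {
          // Always use the same parameterized SQL text so the pool/driver can
          // hand back an already-prepared statement for this connection.
          String sql = "SELECT name FROM Country WHERE code = ?";
          try (Connection con = pool.getConnection();
               PreparedStatement ps = con.prepareStatement(sql)) {
              ps.setString(1, code);
              try (ResultSet rs = ps.executeQuery()) {
                  return rs.next() ? rs.getString(1) : null;
              }
          }
      }
  }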

PHP/MYSQL

So there are good reasons to use prepared statements:
  • Save on query parsing
  • Save on data conversion and copying
  • Avoid SQL Injection
  • Save memory on handling blobs
There are also drawbacks of using prepared statements:
  1. Query cache does not work
  2. Extra server round trip required if statement used only once
  3. Not all statements can be prepared, so you can't use the prepared API exclusively; you'll need to fall back to the normal API for some statements
  4. Newer and sometimes buggy code. I had a lot of problems with PHP prepared statements. It is getting better, but it is still less mature than the standard API
  5. You can't use placeholders for all identifiers. For example, you can't use them for a table name. In certain versions they do not even work for LIMIT boundaries
  6. Inconvenient list handling. Unlike, for example, PEAR's emulated prepared statements, there is no nice way to pass a list of values to IN (see the sketch after this list)
  7. Harder tracing. The logs have been fixed to include the full statement text rather than only "Execute", but in SHOW INNODB STATUS you would still see statements without actual values, which is quite inconvenient for analysis
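
The usual workaround, in any client API, is to build the placeholder list yourself and bind the values one by one. A rough sketch in JDBC terms (the table and column names come from the address example later in this post); note that every distinct list length produces a distinct prepared statement, which also matters for caching:

  import java.sql.Connection;
  import java.sql.PreparedStatement;
  import java.sql.ResultSet;
  import java.util.Collections;
  import java.util.List;

  public class AddressQueries {
      public static void printByIds(Connection con, List<Integer> ids) throws Exception {
          // Build "?,?,?" for however many ids we have, then bind them one by one.
          String placeholders = String.join(",", Collections.nCopies(ids.size(), "?"));
          String sql = "SELECT * FROM address WHERE address_id IN (" + placeholders + ")";
          try (PreparedStatement ps = con.prepareStatement(sql)) {
              for (int i = 0; i < ids.size(); i++) {
                  ps.setInt(i + 1, ids.get(i));
              }
              try (ResultSet rs = ps.executeQuery()) {
                  while (rs.next()) {
                      System.out.println(rs.getInt("address_id"));
                  }
              }
          }
      }
  }
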
For a prepared statement executed via the binary protocol, comparison with statements in the query cache is based on the text of the statement after expansion of ? parameter markers. The statement is compared only with other cached statements that were executed via the binary protocol. That is, for query cache purposes, statements issued via the binary protocol are distinct from statements issued via the text protocol.

We can't reuse a MySQL statement prepared during a previous request in PHP.

If a PHP application uses connection pooling to the database, and the database caches prepared statements, then yes, the caching will persist between pages. If the prepared statement caching is done by the client library, things are more nebulous.

Reusing the same variable name in PHP won't invalidate the MySQL prepare "cache".

$stmt = $dbh->prepare("SELECT column_A FROM Table1 WHERE id=?");
$stmt->bindValue(1, $id, PDO::PARAM_INT);
$stmt->execute();

$stmt = $dbh->prepare("UPDATE Table2 SET column_B=? WHERE column_A=?");
$stmt->bindValue(1, $name);
$stmt->bindValue(2, $column_A);
$stmt->execute();

MYSQL persistent connection

The MySQL server itself offers statement caching via server-side prepared statements:

mysql> PREPARE stmt_name FROM "SELECT name FROM Country WHERE code = ?";
Query OK, 0 rows affected (0.09 sec)
Statement prepared

mysql> SET @test_parm = "FIN";
Query OK, 0 rows affected (0.00 sec)

mysql> EXECUTE stmt_name USING @test_parm;
+---------+
| name    |
+---------+
| Finland |
+---------+
1 row in set (0.03 sec)

mysql> DEALLOCATE PREPARE stmt_name;
Query OK, 0 rows affected (0.00 sec)

When a request is served, PHP "cleans" the instance and frees resources and other variables. This is done in several steps. Since Apache keeps the process alive after a request, not all steps are executed and not all memory is freed. There is, for example, EG(persistent_list), which is used by mysql_pconnect(), pg_pconnect(), and so on. This list isn't emptied between requests as long as the process stays alive (it could be, depending on the actual implementation, but that would defeat the purpose of EG(persistent_list)). If you use persistent connections, your script might get a "re-used" connection established during a previous request.
To (re-)use a prepared statement directly you need the identifier for that statement (and that connection). When using (php-)postgresql this is simply a (per-connection) unique string you pass to pg_execute(), so your script has no problem gaining access to a statement previously prepared by another instance (using the same connection).

Using mysqli or PDO-mysql you need a resource/object as the statement identifier. That's kind of a problem, since neither the mysqli nor the PDO extension seems to offer a way of storing that resource in EG(persistent_list) between requests, and you can't recreate it either, so it seems impossible to reuse a MySQL prepared statement directly.

But as we know, we should not use persistent connections; here are some good reasons why:

  1. When you lock a table, normally it is unlocked when the connection closes, but since persistent connections do not close, any tables you accidentally leave locked will remain locked, and the only way to unlock them is to wait for the connection to time out or to kill the process. 
  2. Normally temporary tables are dropped when the connection closes, but since persistent connections do not close, temporary tables aren't so temporary. If you do not explicitly drop temporary tables when you are done, that table will already exist for a new client reusing the same connection. The same problem occurs with setting session variables. 
  3. If PHP and MySQL are on the same server or local network, the connection time may be negligible, in which case there is no advantage to persistent connections.
  4. Apache does not work well with persistent connections. When it receives a request from a new client, instead of using one of the available children which already has a persistent connection open, it tends to spawn a new child, which must then open a new database connection. This causes excess processes which are just sleeping, wasting resources, and causing errors when you reach your maximum connections, plus it defeats any benefit of persistent connections.
Also, try to use prepared statements smartly. For example:

$ids = array(12,34,56);

  $sql = "SELECT * FROM address WHERE address_id = :address_id";
  $databaseHandle = CustomDB::getDBH();
  $rows = array();
  $preparedStatementHandle = $databaseHandle->prepare($sql);
  foreach ($ids as $id) {
    $preparedStatementHandle->execute(array('address_id' => $id));
    $result = $preparedStatementHandle->fetchAll();
    $rows = array_merge($rows, $result);
  }

It doesn't make sense when we can get the same rows with a single query: SELECT * FROM address WHERE address_id IN (12,34,56);


Prepared Statements become useful when they are paired with session-long caching and a database wrapper that's just good enough to programmatically generate most typical lookups.


Unfortunately, “good enough” caching looks complicated:


Example Queries:

  SELECT * FROM address WHERE address_id = 12;
  SELECT * FROM address WHERE address_id = 34;
  SELECT * FROM address WHERE address_id = 56;

Example Code, assuming PDO & PHP, to produce those queries using Prepared Statements with caching:


  $ids = array(12,34,56);

  $sql = "SELECT * FROM address WHERE address_id = :address_id";
  $databaseHandle = CustomDB::getDBH();
  Cache::setStrategy(Cache_LRU::getLabel());
  $rows = array();

  if (!Zend_Registry::isRegistered(self::PREP_STMT_CACHE_KEY)) {
    Zend_Registry::set(self::PREP_STMT_CACHE_KEY, Cache::getInstance());
  }

  $preparedStatementHandle = Zend_Registry::get(self::PREP_STMT_CACHE_KEY)->get($sql);
  if (!$preparedStatementHandle) {
    $preparedStatementHandle = $databaseHandle->prepare($sql);
    // Use the sql itself as the index/hash
    Zend_Registry::get(self::PREP_STMT_CACHE_KEY)->set($sql, $preparedStatementHandle);
  }

  foreach ($ids as $id) {
    $preparedStatementHandle->execute(array('address_id' => $id));
    $result = $preparedStatementHandle->fetchAll();
    $rows = array_merge($rows, $result);
  }

Prepared Statements fail in a few ways:
  1. Prepared Statements only exist for the current session, so holding onto a handle after a session closes will lead to failures
  2. Each Prepared Statement consumes a handle from the instance's Prepared Statement pool, which is capped by max_prepared_stmt_count
  3. Out Of Memory on the client side
Case 1:
A connection closes while a cache of Prepared Statement Handles exists

Solution:
Update your PDO wrapping class to define a __destruct method that clears the relevant cache of Prepared Statement Handles before calling the parent's destructor.

Case 2:
The max_prepared_stmt_count value is reached on a database

Solution:
Immediately drop all local caches of Prepared Statements and try again. If there is still an issue, activate PDO’s ATTR_EMULATE_PREPARES flag to silently convert calls of ->prepare and ->exec into standard SQL Statements.

Case 3:
Out Of Memory (OOM) on the client-side.

Solution:
Reduce the Prepared Statement Handle Cache size. The cache does not have to be large if it is well managed. Even my company’s complicated webapp’s web requests do not fill a 200-statement-long FIFO cache.

Remaining Points:
  • Monitor (ie: Nagios) Prepared_stmt_count vs max_prepared_stmt_count
  • Monitor (ie: StatsD) the Prepared Statement Handle Cache hit, miss, and purge rate.
  • An LRU's minimal extra overhead is only worthwhile over a simple FIFO if your Prepared Statement Handle Cache is too small for all the queries that should be cached.
  • Note: Your cache should be small due to the unbounded and invisible memory consumption of Prepared Statement Handles on the database server
  • A best-case Prepared Statement instantiation against localhost with a simple select costs me, on average, about 300 microseconds. Pulling a handle from a cache is about 6 microseconds.
  • Coworkers have shown me that long lists of Named Parameters (ie: “:id1, :id2, :id3, [...]“) get more expensive with quantity whereas long lists of Ordered Parameters (ie: “?,?,?,[...]“) remain cheap even in large number. Numerically quantifying this slowdown will be a future post.
  • Ordered Parameters’ values are not decoded in SHOW PROCESSLIST. Named Parameters’ values are displayed, however, which makes them, to me, far preferable. 
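
A minimal sketch of such a bounded handle cache, keyed by the SQL text just like the Zend_Registry example above, here written in Java on top of LinkedHashMap's access-order mode (a PDO wrapper would do the equivalent in PHP; the capacity of 200 mirrors the FIFO size mentioned earlier):

  import java.sql.PreparedStatement;
  import java.util.LinkedHashMap;
  import java.util.Map;

  public class StatementLruCache extends LinkedHashMap<String, PreparedStatement> {
      private final int capacity;

      public StatementLruCache(int capacity) {
          super(16, 0.75f, true); // accessOrder = true gives LRU eviction order
          this.capacity = capacity;
      }

      @Override
      protected boolean removeEldestEntry(Map.Entry<String, PreparedStatement> eldest) {
          if (size() > capacity) {
              try {
                  eldest.getValue().close(); // free the server-side handle on eviction
              } catch (Exception ignored) {
                  // nothing useful to do if close() fails
              }
              return true;
          }
          return false;
      }
  }

Usage is simply cache.get(sql) and, on a miss, cache.put(sql, connection.prepareStatement(sql)).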



Wednesday, October 8, 2014

smooth mobile scrolling

All GUIs generally work the same way. There is a main thread with a loop that processes messages from a queue. Messages can range from "move view to this location" to "user has performed a touch at this location". The whole point is that it is a queue, so every message generally gets processed one at a time, in first come, first served order.

For the majority of UI toolkits, including those found on iOS and Android, accessing and modifying UI objects must be done on the main thread. Despite sometimes being called the UI thread, it is usually also the main thread and is often responsible not just for painting, changing colors and moving objects, but also for loading files, decoding images, handling network responses, etc.

In Android, if you want to animate an object and make it move from location1 to location2, the animation API figures out the intermediate locations (tweening) and then queues the appropriate move operations onto the main thread at the appropriate times using a timer. This works fine, except that the main thread is usually used for many other things: painting, opening files, responding to user input, and so on. A queued timer can often be delayed. Well written programs will always try to do as many operations as possible on background (non-main) threads, but you can't always avoid using the main thread. Operations that touch a UI object always have to be done on the main thread. Also, many APIs will funnel operations back to the main thread as a form of thread safety. It is usually almost impossible to keep every operation on the main thread down to 1/60th of a second so that animations can be processed smoothly. Even if Google could manage to get their code to do just that, it doesn't mean third party application writers will be able to.
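
A minimal Android-flavored sketch of that rule, posting slow work to a background thread and the UI update back to the main thread (the class, view and method names are illustrative):

  import android.os.Handler;
  import android.os.Looper;
  import android.widget.TextView;

  public class BackgroundWorkExample {
      private final Handler mainHandler = new Handler(Looper.getMainLooper());

      // Do the slow work off the main thread, then hop back onto it for the UI update.
      public void loadAndShow(final TextView label) {
          new Thread(new Runnable() {
              @Override
              public void run() {
                  final String result = slowFileOrNetworkCall(); // background thread
                  mainHandler.post(new Runnable() {
                      @Override
                      public void run() {
                          label.setText(result); // UI objects may only be touched on the main thread
                      }
                  });
              }
          }).start();
      }

      private String slowFileOrNetworkCall() {
          return "done"; // placeholder for file loading, image decoding, network handling, etc.
      }
  }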

In iOS, operations on UI objects also must be done on the main thread, with the exception of animation operations done via CoreAnimation. CoreAnimation runs on a background thread and is able to directly manipulate, move, recolor and reshape UI objects on that background (CoreAnimation) thread. Compositing and rendering are also performed on this thread. It does this through a combination of hardware and software, providing very smooth and fast animations. From the main thread you basically issue a call to CoreAnimation and tell it to move object1 from location1 to location2. This animation will continue to run even if the main thread is blocked performing another operation. This is why animations almost never stutter on iOS.

The main thread manages application data and UI application state (UI application state includes things such as the strings to be displayed in a ListView), but issues physical UI state change requests to a separate and dedicated high priority CoreAnimation thread (physical state includes things such as color, position and shape). All physical state changes can be animated, and CoreAnimation will also perform the tweening for you (like the Android animation APIs). Non-animated physical state changes are issued directly to CoreAnimation, and the main thread (not the CoreAnimation thread) will block until those are performed. Animated physical state changes issued by the main thread are performed asynchronously by the CoreAnimation thread. Because physical UI state, and only physical UI state, is managed by the CoreAnimation thread, the main thread can be blocked or busy but the CoreAnimation thread will still continue not only to accurately render the last known state of the UI (as issued by the main thread) but also to render any pending or incomplete animated physical UI state changes requested by the main thread.

In Windows Vista, Microsoft introduced desktop composition whereby the OS maintained a separate pixel buffer for every window.  This meant that even if an application hung, the last state of the window (how it looked) is still rendered rather than just being drawn as white (the OS partially managed the state of the pixels in the window).  CoreAnimation goes beyond this and offloads much of the UI work traditionally managed by the main thread including managing the state of not just the pixels (like Vista) but of higher level concepts such as widgets, widget locations, widget colors etc.

As for the Android animation model, it's the way many toolkits work, including Flash, which was definitely very animation heavy. I would say the iOS model makes the overall user experience nicer and offloads one more worry from the developer back to the operating system. I'm sure Google will continue to recognize the importance of animation on touch screen devices and continue to accelerate (or re-architect) Android in coming releases.

A 5-year-old 1st generation iPhone will perform smoother and more reliable animations than the latest quad core Samsung Android phone. It's a software design problem and not something you can throw more cores at (not least because the main thread will only ever run on one core!). Don't believe people when they excuse stutter and lag as "oh, just the Android Java garbage collector". Modern compacting, generational garbage collectors generally aren't the cause of the kind of stutter you see on Android.


Let's not forget that Android does true multitasking and background process execution, unlike iOS, which adds to the overhead. Also adding to the overhead: native iOS apps are binaries pre-compiled for their own hardware, while Android uses the Dalvik virtual machine with just-in-time compilation to run Dalvik dex-code (Dalvik Executable), which is usually translated from Java bytecode.

Kinetic scrolling

Kinetic scrolling is the combination of regular, drag-finger-on-screen scrolling with an additional movement after the finger is lifted off the screen. Based on how fast the finger was dragged on the screen, the duration, speed and deceleration of the additional movement can vary.

Since kinetic scrolling can be viewed as the sum of two features, it can be implemented in two steps.

The first step is click & drag scrolling. It can be achieved by installing an event filter and intercepting mouse press, move and release events. When a press event is received the scrolling starts, when a move event is received the list is scrolled, and finally when a release event is received the scrolling stops. To avoid accidental clicks, all the events are blocked inside the filter function.
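
A minimal Android-flavored sketch of this first step, assuming a custom scrolling View subclass (the class name is illustrative):

  import android.content.Context;
  import android.view.MotionEvent;
  import android.view.View;

  public class DragScrollView extends View {
      private float lastY;

      public DragScrollView(Context context) {
          super(context);
      }

      @Override
      public boolean onTouchEvent(MotionEvent event) {
          switch (event.getAction()) {
              case MotionEvent.ACTION_DOWN:
                  lastY = event.getY();      // scrolling starts
                  return true;
              case MotionEvent.ACTION_MOVE:
                  float dy = lastY - event.getY();
                  lastY = event.getY();
                  scrollBy(0, (int) dy);     // scroll the content by the drag distance
                  return true;
              case MotionEvent.ACTION_UP:
                  return true;               // scrolling stops; inertia is step two
          }
          return super.onTouchEvent(event);
      }
  }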

For step two, the scroller continues to scroll the list automatically after the user has lifted their finger off the screen, gradually slows down and then stops. To display a pleasing effect, the scroller must decide how fast to scroll, how far to scroll and how fast to slow down.

A good starting point is "how fast to scroll". In physics, velocity represents the direction in which and the magnitude by which an object changes its position; speed is another word for magnitude in this context. The "how fast to scroll" question can be answered by recording the cursor's drag velocity on the screen. A simple but imprecise way to do this is to poll the cursor position at specific time intervals; the difference in positions represents the speed (measured in pixels per timer interval) and the mathematical sign of the difference represents the direction. This algorithm will give a good enough idea of whether the cursor is moving fast or slowly.

Next up is "how far to scroll". How far is actually connected to how fast to slow down, because the list is scrolled with a certain velocity and then decelerates until it stops. Since the velocity has previously been established, the only thing left is to calculate the deceleration based on friction. In physics, kinetic friction is the resistance encountered when one body is moved in contact with another. Of course, there can be no friction between pixels, but kinetic scrolling is a simulation and one can pretend that the list items are moving over the list container and that this movement generates friction. In reality friction is calculated based on the nature of the materials, mass, gravitational force and so on. In the simulation a numeric value is used to alter the speed of scrolling.

Having determined the deceleration, "how far" the list scrolls kinetically is simply a function of the time that it needs to reach a speed of zero.
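
Under constant deceleration these are just the standard motion formulas; a small sketch (units here are pixels and frames, purely illustrative):

  public final class KineticMath {
      private KineticMath() {}

      // With an initial speed v (pixels/frame) and constant deceleration a (pixels/frame^2):
      // time until the scroll stops is t = v / a, and the distance covered is d = v*v / (2*a).
      public static double timeToStop(double speed, double deceleration) {
          return speed / deceleration;
      }

      public static double distanceToStop(double speed, double deceleration) {
          return (speed * speed) / (2 * deceleration);
      }
  }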

Final Steps:
  • Measure the velocity of the finger/cursor.
  • Implement a simple particle physics loop (see the sketch after this list).
  • Give your particle "bounds" using math derived from the width of your scrolling plane and the width of your viewport.
  • Continuously add the difference between the mouse velocity and the particle velocity to the particle's velocity, so the particle's velocity "matches" the mouse's velocity for as long as it's moving.
  • Stop doing the previous step as soon as the user lifts their finger. The physics loop takes care of inertia.
  • Add your personal flourishes such as "bumper" margins and smooth scrolling "anchor" points.
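
A minimal sketch of such a particle loop, assuming a fixed-timestep update called once per frame (the names and constants are illustrative, not taken from any particular toolkit):

  public class ScrollParticle {
      private double position;              // current scroll offset in pixels
      private double velocity;              // pixels per frame
      private final double friction = 0.95; // fraction of velocity kept each frame (< 1.0)
      private final double minOffset;
      private final double maxOffset;       // e.g. content height minus viewport height

      public ScrollParticle(double minOffset, double maxOffset) {
          this.minOffset = minOffset;
          this.maxOffset = maxOffset;
      }

      // While the finger is down: add the difference between the finger velocity and the
      // particle velocity, so the particle matches the finger (step 4 in the list above).
      public void track(double fingerVelocity) {
          velocity += fingerVelocity - velocity;
      }

      // Called once per frame; after the finger lifts, friction alone slows the particle.
      public void step() {
          position += velocity;
          velocity *= friction;
          if (position < minOffset) { position = minOffset; velocity = 0; } // hit the bounds
          if (position > maxOffset) { position = maxOffset; velocity = 0; }
          if (Math.abs(velocity) < 0.1) velocity = 0; // close enough to stopped
      }

      public double getPosition() {
          return position;
      }
  }
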
Basic scrolling functionality: This task consisted of handling drag events in the simplest way, which is by scrolling the contents according to the drag distance and direction. This was relatively straightforward to implement; the only tricky aspect was knowing when to start a drag operation vs. when to pass down a tap event to a child widget inside the scrollable area.

Scroll inertia: This one was the most challenging. The idea here is that scrolling should continue for some time after the user lifts the finger, slowing down until it stops completely. For this I needed to have an idea of the scroll velocity. Unfortunately it is not accurate to compute the velocity from a single sample, so while the user is scrolling I record the last N motion events in a circular buffer, along with the time at which each event occurred. I found N=4 to work just fine on the iPhone and on the HP TouchPad. When the finger is lifted I can compute an approximate start velocity for the inertial scrolling from the recorded motion. I defined a negative acceleration coefficient and used standard motion formulas (see here) to let the scrolling die down nicely. If the scroll position reaches a border while still in motion I just reset the velocity to 0 to prevent it from going out of range (the abrupt stop is addressed next).
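
A sketch of that velocity estimate from the last N samples (N = 4 as in the text; the names are illustrative):

  public class VelocityEstimator {
      private static final int N = 4;            // number of recent samples kept
      private final float[] positions = new float[N];
      private final long[] times = new long[N];  // timestamps in milliseconds
      private int count;                         // how many samples we have so far
      private int next;                          // slot to overwrite next (circular buffer)

      public void addSample(float position, long timeMillis) {
          positions[next] = position;
          times[next] = timeMillis;
          next = (next + 1) % N;
          if (count < N) count++;
      }

      // Approximate velocity in pixels per millisecond over the buffered window.
      public float currentVelocity() {
          if (count < 2) return 0f;
          int oldest = (count < N) ? 0 : next;
          int newest = (next - 1 + N) % N;
          long dt = times[newest] - times[oldest];
          return dt == 0 ? 0f : (positions[newest] - positions[oldest]) / dt;
      }
  }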

Flexible scrolling limits: instead of going into an abrupt stop when the scroll reaches the end I wanted the widget to scroll some, but offering resistance. For this I extended the allowed scroll range on both ends by an amount that I defined as a function of the widget dimensions. I've found that adding half the width or height on each end worked nicely. The trick to give the scrolling the feeling that it is offering some resistance was to adjust the displayed scroll positions when they are out of range. I used a scaling down plus a deceleration function for this (there are some good easing functions here).

Spring behavior: since it is now possible to scroll past the valid range, I needed a way to bring the scroller back to a valid position if the user left it out of range. This is achieved by adjusting the scroll offset when the scroller comes to a stop at an out-of-range position. The adjustment function that I've found to give a nice springy look was to divide the distance from the current position to the desired position by a constant and move the offset by that amount. The bigger the constant, the slower the motion.
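
A sketch of that spring-back adjustment, run once per frame while the scroller is at rest out of range (the divisor is illustrative):

  public final class SpringBack {
      private static final double SPRING_DIVISOR = 8.0; // bigger value = slower return

      // Move the current offset a fraction of the way toward the nearest valid position.
      public static double step(double offset, double minOffset, double maxOffset) {
          double target = Math.max(minOffset, Math.min(maxOffset, offset));
          return offset + (target - offset) / SPRING_DIVISOR;
      }
  }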

Scrollbars: the final touch was to add overlay scrollbars, which fade in when scrolling starts and fade out when it ends.

Quick list for android:
  • Reduce the number of conditions used in the getView of your adapter.
  • Check and reduce the number of garbage collection warnings that you get in the logs
  • If you're loading images while scrolling, get rid of them
  • Set scrollingCache and animateCache to false (more on this later)
  • Simplify the hierarchy of the list view row layout
  • Use the view holder pattern
  • Use dragging the proper way (dragging is the type of scrolling that occurs when a user drags her finger across the touch screen. Simple dragging is often implemented by overriding onScroll() in GestureDetector.OnGestureListener.)
  • Use flinging proper way (is the type of scrolling that occurs when a user drags and lifts her finger quickly. After the user lifts her finger, you generally want to keep scrolling (moving the viewport), but decelerate until the viewport stops moving. Flinging can be implemented by overriding onFling() in GestureDetector.OnGestureListener, and by using a scroller object.)