Update on the InnoDB double-write buffer and EXT4 transactions

In a post, written a few months ago, I found that using EXT4 transactions with the “data=journal” mount option, improves the write performance significantly, by 55%, without putting data at risk. Many people commented on the post mentioning they were not able to reproduce the results and thus, I decided to further investigate in order to find out why my results were different.

So, I ran sysbench benchmarks on a few servers and found when the InnoDB double-write buffer limitations occur and when they don’t. I also made sure some of my colleagues were able to reproduce the results. Basically, in order to reproduce the results you need the following conditions:

  • Spinning disk (no SSD)
  • Enough CPU power
  • A dataset that fits in the InnoDB buffer pool
  • A continuous high write load with many ops waiting for disk

Using the InnoDB double write buffer on an SSD disk somewhat prevents us from seeing the issue, something good performance wise. That comes from the fact that the latency of each write operation is much lower. That makes sense, the double-writer buffer is an area of 128 pages on disk that is used by the write threads. When a write thread needs to write a bunch of dirty pages to disk, it first writes them sequentially to free slots in the double write buffer in a single iop and then, it spends time writing the pages to their actual locations on disk using typically one iop per page. Once done writing, it releases the double-write buffer slots it was holding and another thread can do its work. The presence of a raid controller with a write cache certainly helps, at least until the write cache is full. Thus, since I didn’t tested with a raid controller, I suspect a raid controller write cache will delay the apparition of the symptoms but if the write load is sustained over a long period of time, the issue with the InnoDB double write buffer will appear.

So, to recapitulate, on a spinning disk, a write thread needs to hold a lock on some of the double-write buffer slots for at least a few milliseconds per page it needs to write while on a SSD disk, the slots are released very quickly because of the low latency of the SSD storage. To actually stress the InnoDB double-write buffer on a SSD disk, one must push much more writes.

That leads us to the second point, the amount of CPU resources available. At first, one of my colleague tried to reproduce the results on a small EC2 instance and failed. It appeared that by default, the sysbench oltp.lua script is doing quite a lot of reads and those reads saturate the CPU, throttling the writes. By lowering the amount of reads in the script, he was then able to reproduce the results.

For my benchmarks, I used the following command:

sysbench --num-threads=16 --mysql-socket=/var/lib/mysql/mysql.sock
--mysql-database=sbtest --mysql-user=root
--test=/usr/share/doc/sysbench/tests/db/oltp.lua --oltp-table-size=50000000
--oltp-test-mode=complex --mysql-engine=innodb --db-driver=mysql
--report-interval=60 --max-requests=0 --max-time=3600 run

Both servers used were metal boxes with 12 physical cores (24 HT). With less CPU resources, I suggest adding the following parameters:

--oltp-point-selects=1
--oltp-range-size=1
--oltp-index-updates=10

So that the CPU is not wasted on reads and enough writes are generated. Remember we are not doing a generic benchmarks, we are just stressing the InnoDB double-write buffer.

In order to make sure something else isn’t involved, I verified the following:

  • Server independence, tried on 2 physical servers and one EC2 instance, Centos 6 and Ubuntu 14.04
  • MySQL provided, tried on MySQL community and Percona Server
  • MySQL version, tried on 5.5.37 and 5.6.23 (Percona Server)
  • Varied the InnoDB log file size from 32MB to 512MB
  • The impacts of the number of InnoDB write threads (1,2,4,8,16,32)
  • The use of Linux native asynchronous iop
  • Spinning and SSD storage

So, with all those verifications done, I can maintain that if you are using a server with spinning disks and a high write load, using EXT4 transactions instead of the InnoDB double write buffer yields to an increase in throughput of more than 50%. In an upcoming post, I’ll show how the performance stability is affected by the InnoDB double-write buffer under a high write load.

Appendix: the relevant part of the my.cnf

innodb_buffer_pool_size = 12G
innodb_write_io_threads = 8 # or else in {1,2,4,8,16,32}
innodb_read_io_threads = 8
innodb_flush_log_at_trx_commit = 0 # must be 0 or 2 to really stress the double write buffer
innodb_log_file_size = 512M # or 32M, 64M
innodb_log_files_in_group = 2
innodb_file_per_table
innodb_flush_method=O_DIRECT # or O_DSYNC
innodb_buffer_pool_restore_at_startup=300 # On 5.5.x, important to warm up the buffer pool
#innodb_buffer_pool_load_at_startup=ON # on 5.6, important to warm up the buffer pool
#innodb_buffer_pool_dump_at_shutdown=ON # on 5.6, important to warm up the buffer pool,
skip-innodb_doublewrite # or commented out
innodb_flush_neighbor_pages=none # or area for spinning

The post Update on the InnoDB double-write buffer and EXT4 transactions appeared first on MySQL Performance Blog.

Q&A: High availability when using MySQL in the cloud

Percona MySQL webinar followup: Q&ALast week I hosted a webinar on using MySQL in the cloud for High Availability (HA) alongside 451 Research analyst Jason Stamper. You can watch the recording and also download the slides (free) here. Just click the “Register” button at the end of that page.

We had several excellent questions and we didn’t have time to get to several of them in the allotted time. I’m posting them here along with the answers. Feel free to ask follow-up questions in the comments below.


Q: Can the TokuDB engine be used in a PXC environment?

A: No, TokuDB cannot currently be used in a PXC environment, the only supported engine in Percona XtraDB Cluster 5.6 is InnoDB.

Q: With Galera replication (PXC), is balancing the load on each node?

A: No, you need to implement your own load balancing and HA layer between your clients and the Percona XtraDB Cluster server.  Examples mentioned in the webinar include HAProxy and F5 BigIP.

Q: What’s the best version of Percona XtraDB Cluster regarding InnoDB performance?

A: In general for best performance you should be using the latest release of Percona XtraDB Cluster 5.6, which is currently 5.6.24, released on June 3rd, 2015.

Q: Can I redirect my writes in Percona XtraDB Cluster to multiple nodes using the HAProxy? While trying with SysBench I can see write-only goes to first nodes in PXC while reads does goes to multiple nodes.

A: Yes you can configure HAProxy to distribute both reads and writes across all of your nodes in a Percona XtraDB Cluster environment. Perhaps SysBench created only one database connection for all writes, and so haproxy kept those confined to only one host. You may want to experiment with parallel_prepare.lua.

Q: What’s the optimal HA for small datasets (db is less than 10gb)?

A: The optimal HA deployment for small datasets would be dependent on your level of recovery required (tolerance for loss of transactions) and time that you can be in an unavailable state (seconds, minutes, hours?).  Unfortunately there isn’t a single answer to your question, however, if you are interested in further discussion on this point Percona would be happy to coordinate a time to speak.  Please feel free to contact me directly and we can continue the conversation at michael.coburn@percona.com.

 Q: Is there a concept of local master vs. remote master with PXC?

A: No there is no concept of local vs remote master.  All nodes in a Percona XtraDB Cluster can now be classified as Master, regardless of their proximity to the clients.

Q: Are there any concerns when considering AWS RDS or AURORA DB for MySQL HA in the Cloud?

A: Regarding AWS RDS, yes this a good option for MySQL HA in the Cloud.  I unfortunately haven’t worked with Aurora DB that much yet so I don’t have an opinion on it’s suitability for HA in the Cloud.

Q: We tried out PXC awhile back and it used to lock everything whenever any ddl was done. Has that changed?

A: We would have to look at the specifics of your environment, however, there have been numerous improvements in the 1½ years of development since Percona XtraDB Cluster went Generally Available (GA) on January 30th, 2014 in version 5.6.15.

Q: Is using the arbitrator a must?

A: No the arbitrator role via the garbd daemon is generally only used when operating in a minimal environment of two nodes that contain the data and you need a third node for quorum but don’t want to store the data a third time.

Q: Can we do a cluster across different zones?

A: Yes you can. However be aware that the latency incurred for all cluster certification operations will be impacted by the round trip time between nodes.

Q: Does PXC also support the MyISAM database?

A: No, Percona XtraDB Cluster does not support any storage engine other than InnoDB as of PXC 5.6.

Q: How do load balancers affect the throughput in a Galera-based setup given that the write would be limited by the slowest node?

A: Load balancers will introduce some measure of additional latency in the form of CPU time in the load balancer layer as it evaluates its own ruleset, and also in network time due to additional hop via load balancer.  Otherwise there should be no perceptible difference in the write throughput of a Percona XtraDB Cluster with and without a load balancer as it relates to the “slowest node” factor.

Q: Have you used MaxScale yet? If so, what are your thoughts?

A: Unfortunately I haven’t used MaxScale however Yves Trudeau, Percona Principal Architect, has recently written about MaxScale in this blog post.

Q: How do you configure timeout and maintain persistent connection to HAProxy?

A: I would encourage you to refer to the HAProxy Documentation.

The post Q&A: High availability when using MySQL in the cloud appeared first on MySQL Performance Blog.

Monitoring MySQL flow control in Percona XtraDB Cluster 5.6

Monitoring flow control in a Galera cluster is very important. If you do not, you will not understand why writes may sometimes be stalled. Percona XtraDB Cluster 5.6 provides 2 status variables for such monitoring: wsrep_flow_control_paused and wsrep_flow_control_paused_ns. Which one should you use?

What is flow control?

Flow control does not exist with regular MySQL replication, but only with Galera replication. It is simply the mechanism nodes are using when they are not able to keep up with the write load: to keep replication synchronous, the node that is starting to lag instructs the other nodes that writes should be paused for some time so it does not get too far behind.

If you are not familiar with this notion, you should read this blogpost.

Triggering flow control and graphing it

For this test, we’ll use a 3-node Percona XtraDB Cluster 5.6 cluster. On node 3, we will adjust gcs.fc_limit so that flow control is triggered very quickly and then we will lock the node:

pxc3> set global wsrep_provider_options="gcs.fc_limit=1";
pxc3> flush tables with read lock;

Now we will use sysbench to insert rows on node 1:

$ sysbench --test=oltp --oltp-table-size=50000 --mysql-user=root --mysql-socket=/tmp/pxc1.sock prepare

Because of flow control, writes will be stalled and sysbench will hang. So after some time, we will release the lock on node 3:

pxc3> unlock tables;

During the whole process, wsrep_flow_control_paused and wsrep_flow_control_paused_ns are recorded every second with mysqladmin ext -i1. We can then build a graph of the evolution of both variables:

wsrep_flow_control_pxc3

While we can clearly see when flow control was triggered on both graphs, it is much easier to know when flow control was stopped with wsrep_flow_control_paused_ns. It would be even more obvious if we have had several timeframes when flow control is in effect.

Conclusion

Monitoring a server is obviously necessary if you want to be able to catch issues. But you need to look at the right metrics. So don’t be scared if you are seeing that wsrep_flow_control_paused is not 0: it simply means that flow control has been triggered at some point since the server started up. If you want to know what is happening right now, prefer wsrep_flow_control_paused_ns.

The post Monitoring MySQL flow control in Percona XtraDB Cluster 5.6 appeared first on MySQL Performance Blog.

ScaleArc: Benchmarking with sysbench

ScaleArc recently hired Percona to perform various tests on its database traffic management product. This post is the outcome of the benchmarks carried out by Uday Sawant (ScaleArc) and myself. You can also download the report directly as a PDF here.

The goal of these benchmarks is to identify the potential overhead of the ScaleArc software itself and the potential benefits of caching. The benchmarks were carried out with the trunk version of sysbench. For this reason, we used a very small set of data, so the measurements will be fast, and it’s known that caching has huge benefits when the queries themselves are rather expensive. We decided that we would rather show that benefit with a real-world application, which is coming later is this series. And if you’re in the Silicon Valley area, be sure to join us this evening at the first-ever Open Source Appreciation Day – I’d be happy to discuss the findings presented here in this post. Admission is free but due to limited space you should register now. I’ll also be available throughout the Percona Live MySQL Conference and Expo all this week.

sysbench_image1.2

In this summary graph it’s visible that in terms of throughput (read-only benchmark, which is relevant for read mostly applications), ScaleArc doesn’t have any significant overhead, while caching can have potentially huge benefits.

sysbench_image2

The situation is pretty similar with response times. ScaleArc doesn’t add any significant overhead, and caching can mean huge benefit in terms of response time as well.

In case of this particular workload (which is read only sysbench), using caching means a roughly 3x increase in throughput and a roughly 80% drop in response time.

Overall, ScaleArc is a good product in terms of performance and features as well. I would definitely recommend it.

About ScaleArc for MySQL
ScaleArc for MySQL is a software appliance that drops in transparently between applications and databases to improve application availability and performance. It requires no changes to applications or databases and delivers:

  • Instant scale up – transparent connection pooling and multiplexing, TTL-based transparent caching, surge protection
  • Transparent scale out – read/write split, load balancing, query routing, sharding
  • Automatic high availability – automatic failover
  • Real-time actionable analytics

Benchmarking setup
The client machines are running the benchmarking software like sysbench in case of these benchmarks.

CPU: 2 x Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz (6 cores, chip multithreading off)
Memory: 64G

We used 2 clients. The results of the 2 clients are graphed separately, so it’s visible that they put the same amount of workload on the database or ScaleArc software.

Database machines
CPU: 2 x Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz (6 cores, chip multithreading off)
Memory: 64G

Running MySQL Community Edition 5.6.15

MySQL configuration

[mysqld]
   max_allowed_packet = 64M
   thread_cache = 256
   query_cache_size = 0
   query_cache_type = 0
   max_connections = 20020
   max_user_connections = 20000
   max_connect_errors = 99999999
   wait_timeout = 28800
   interactive_timeout = 28800
   log-error=/var/lib/mysql/mysql.err
   back_log=60000
   innodb_buffer_pool_size = 3G
   innodb_additional_mem_pool_size = 16M
   innodb_log_buffer_size = 8M
   innodb_flush_log_at_trx_commit = 0
   innodb_flush_method = O_DIRECT
   innodb_open_files = 2000
   innodb_file_per_table
   innodb_log_file_size=2G
   innodb_log_files_in_group=2
   innodb_purge_threads=1
   innodb_max_purge_lag=0
   innodb_support_xa=0
   innodb_locks_unsafe_for_binlog = 1
   innodb_buffer_pool_instances=8
   sql_mode=NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES

The buffer pool of the database is intentionally small, so it’s easy to generate a disk-bound workload.

Please note that the following settings are not recommended in production.

innodb_support_xa=0
   innodb_locks_unsafe_for_binlog = 1

 

We used these settings to drive the node to its peak performance, avoiding any possible overhead which might be required on a production system. In typical production settings, these are not set, and binary logging is enabled, which potentially reduces ScaleArc’s overhead further.

ScaleArc software appliances
CPU: 1 x Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz (6 cores, chip multithreading off)
Memory: 64G

The machines were running ScaleArc for MySQL 3.0.

Network
The machines were connected using 10G connections.

Measurements
All of the measurements were done with a very small database that completely fits in memory.

--oltp-table-size=10000
  --oltp-tables-count=64

 

In these benchmarks, we expected both the database and ScaleArc to be CPU bound. In case of a disk-bound workload, ScaleArc would shine even more than in this benchmark. If the queries are more expensive (they have to hit storage), the overhead in % is smaller, and in case of caching the query-by-query benefit is bigger.

We measured 3 different setups, both on read-only and read-write cases. These are the following.

  • Direct connection to the database.
  • Connection to the database through ScaleArc, where ScaleArc only acts as a pass-through filter (since it’s a load balancer that speaks the MySQL wire protocol, all the mechanics for that are still in place). Please note that this setup doesn’t make sense in real life. The purpose of this setup is to show the potential overhead of using ScaleArc and uncovering potential limitations of the ScaleArc software itself.
  • Connection to database through ScaleArc, where ScaleArc is allowed to cache. Caching in ScaleArc is TTL (Time To Live)-based caching, meaning that a read query’s results are cached in ScaleArc. If that read query is seen again before expiring, the query is not run again on the database server but rather served from the cache. Once the timer for the cached query expires, the query will be issued on the database again. Caching of course only works for reads, which are not in an explicit transaction (autocommit is on and no START TRANSACTION is issued). Because of that, we used –oltp-skip-trx during cached benchmarks (read-only case). In case of these benchmarks, the TTL was 1 hour, because we wanted to saturate the ScaleArc software while serving cached queries. An 1 hour TTL might be unrealistic for some applications, while for other applications even an 1 day TTL is something they can live with for some queries. In this case, we wanted to measure the cache’s performance, so we wanted the queries to be cached during the entire benchmark run to show the potential gain even in case of very small queries.

TTL-based caching
It’s important to note that the cache’s expiration is controlled by a TTL value – there is no other invalidation, so it’s possible to read stale data when the query results is changed, but the cache is not expired. Reading stale data alone is ok for most applications, it can happen with a regular, asynchronous slave if it’s lagging behind the master (and it always lagging behind somewhat). Otherwise, the cache is pretty similar to MySQL’s query cache, which doesn’t suffer from the stale read problem, but it has a coarse invalidation (if a table is written, the cache entries belonging to the given table are flushed). While the cache is flushed, the query cache mutex is held, which blocks reads even. Because of the mutex, the built-in query cache is a very usual performance bottleneck. ScaleArc’s cache doesn’t suffer from this.

It’s important to note that ScaleArc caches nothing by default. Also, there are other ways to invlidate cache entries apart from waiting for the TTL to expire.

  • API Call based invalidation (you can clear the cache for an entire query pattern rule with one API call)
  • Query comment based invalidation (you can put a comment /*wipe*/ before a query and wipe and refresh the cache)
  • Cache Bypass (you can send a comment /*nocache*/ and bypass the cache for that specific query)

Read-only
Sysbench throughput

sysbench_image6

In the lower region of threads (up to 32), we see that the TPS value significantly drops in case of going through ScaleArc. That’s nothing to be surprised about, the reason for that is network roundtrips. Because ScaleArc is a software appliance, it adds a hop between the database and the application, which introduces latency. If the number of threads is higher (32 and up), this starts to matter less and less, and performance is almost identical which is very impressive. It means that around the optimal degree of parallelism for these machines, ScaleArc introduces very little (barely measurable) overhead.

Sysbench response time
sysbench_image7
This graph contains the response times belonging to the previous benchmarks. This is really hard to read because at 4096 threads, the system is overloaded, and the response time is much more than in the maximum throughput region. Because it’s multiple orders of magnitude higher, the interesting response times are not readable from this graph.

sysbench_image8

The following graph is the same as above, except that the y axis is limited to 250 ms, so the region which is not visible on the graph above is visible here. What we see there regarding the overhead is pretty much the same as we saw in case of the throughput graph, which means that ScaleArc by itself introduces immeasurably low latency (which explains the difference in cases when parallelism is low). Usually applications which are utilizing the database server are using significantly more than one thread (in MySQL a single query always uses a single thread, in other words there is no intra-query parallelism). The latency from 32 threads above is actually somewhat lower when going through ScaleArc (the exact tipping point can be different here based on the number of CPUs). The reason for that is ScaleArc itself uses an event loop to connect to MySQL, so at a high concurrency, and can schedule sending the traffic to MySQL differently. This only matters when otherwise the MySQL server is saturated CPU-wise.

CPU utilization

sysbench_image9

Last but not least, this graph contains the CPU utilization of the different setups. The left-hand side shows the CPU utilization when connecting directly to the database, and the right-hand side shows connecting through ScaleArc. In both cases, the database server’s CPU is the bottleneck. It’s visible that the client node’s CPU is more than 75% idle (only client1 is graphed to improve readability, client2 is practically the same). From 32 threads and up, the blue bar (CPU user%) is relatively high on the database servers, as is the green (CPU sys%). From 64 threads, the idle time is practically 0, until the systems are overloaded. On the right hand side, we can see that ScaleArc at this load still had 50% idle CPU, which means that we could practically do the same benchmark on another set of boxes through the very same ScaleArc, and only then it will be fully utilized. We are talking about 3000 sysbench tps here. One more interesting thing to note is the relatively high system time of ibd. This is also because of the way ScaleArc connects to the database (see the previous paragraph).

[  17s] threads: 64, tps: 3001.98, reads/s: 41962.70, writes/s: 0.00, response time: 35.22ms (95%)

 

These threads are from a single client, which means that ScaleArc could keep up with parsing roughly 84000 statements / second with utilizing half of its CPU, which is impressive. Please note that the ScaleArc software in this case was tuned towards this type of workload, which means we had more query processing threads. In case of caching, we will have more cache handler threads.

Effects of caching on read-only workload
Sysbench throughput

The next set of graphs will compare the cases when cache is used and not used.

sysbench_image11

The preceding TPS graph contains reads / second (because we measured with –oltp-skip-trx), so roughly 42000 reads corresponds to roughly 3000 transactions in the earlier setup (14 reads in a transaction). On the left-hand side of the graph, the cached throughput is visible with green – on the right-hand side, the non-cached throughput is visible with red (direct access) and blue (access through ScaleArc as a pass-through filter). It’s visible that caching improves the speed drastically, but when ScaleArc becomes overloaded (8192 client threads, 4096 from each client), the performance becomes somewhat inconsistent, which is understandable considering how few cores ScaleArc was running on. On the graph, the dots are translucent, which means the colors are brighter in the areas that have more samples. Even in the overloaded case, the majority of the samples are in the region of 100k+ reads / second across two clients, which means that the performance degrades very gracefully even under heavy load.

Sysbench response time

sysbench_image12

Like in the case of a non-cached workload, the response times are not too readable because of the very high response times when the systems are overloaded. But from the overloaded response times visible, it seems like using caching doesn’t make response times worse.

sysbench_Image13

Like in the case of non-cached workload, this graph is the zoomed version of the previous one. Here the maximum of the y axis is 100 ms. From this graph, it’s visible that at lower concurrency and at the optimal throughput, caching actually helps response time. This is understandable, since in case of a cache hit, ScaleArc can serve the results, and the client (in our case here sysbench) doesn’t have to go to the database, so a roundtrip and database processing time is spared. It’s also worth mentioning that the data “comes from memory,” it doesn’t matter if we hit the ScaleArc cache of the database. When the ScaleArc cache is used, the response time is lower because the additional roundtrip to the database and potential database work (like parsing SQL) is avoided. This means that caching can have benefits even if the database fits in the buffer pool. The improvement is always subject to the workload – caching helps the most when it can cache relatively expensive queries like aggregations and queries hitting the storage.

CPU utilization

sysbench_image14

Similarly to the previous case, the preceding graph shows CPU utilization of the various components. In case of the cached workload, the client itself is much more utilized (since it gets responses sooner, it has to generate the traffic faster). With this kind of workload, when using only one client, we would hit the client’s CPU as the performance bottleneck. The database is interesting too. With caching, its CPU is barely used. This is because if a query is served from the cache, it never gets to the database, so the database’s CPU utilization will be lower. In other words, using the cache helps to offload the database. If offloading is visible on ScaleArc’s graphs, when caching is used, the CPU on the server hosting ScaleArc is much more utilized. For this benchmark, the ScaleArc software was tuned to handle a cached workload, which means more cache handler threads.

Read-write
For read-write benchmarks, we had to create oltp_nontran.lua, which is the same sysbench benchmark as oltp.lua, except that it does the reads outside of the transaction and does only the writes in transaction, so caching can have an effect on read. The rest of the benchmarking setup is the same as the read-only case.

Sysbench throughput

sysbench_image15

Similarly to the read-only case, at a low concurrency, the overhead of ScaleArc is coming from the additional network roundtrip. At the optimal concurrency, the overhead is barely measurable (the dots are plotted practically on top of each other).

Sysbench response time

sysbench_image16

sysbench_image17

The case is pretty similar with the response times as in the read-only case. Similarly, the second graph is a zoomed version of the first one, which a 250 ms maximum.

CPU utilization

sysbench_image18

The CPU utilization graph shows that in this case, the database server’s CPU is the bottleneck. What is interesting is that ScaleArc is using less CPU than in the read-only case. This is understandable, since a transaction now contains writes as well, which are expensive on the database side, but they are still just statements to route on the ScaleArc side.

Effects of caching on read-write workload
Measuring caching here is interesting because the workload is no longer read-only of mostly reads. We have a very significant amount of writes.

sysbench_image19

For 30k reads, we get 8,5k writes. It’s expected that caching won’t help as much as in the previous case, because writes can’t be cached and while they are in process, the benchmarking threads can’t proceed with reads. Please note that this means that roughly 25% of the traffic is write, a typical application scaling out with additional slaves for reads doesn’t have this kind of read-to-write ratio.

Sysbench throughput

sysbench_image20

The first graph shows that in terms of total throughput, caching still helps.

Sysbench response time

sysbench_image21

sysbench_image22

Similarly to the read-only case, caching also helps response time, because it reduces the time needed for the read part of the workload.

CPU utilization

sysbench_image24

This test really stresses the database server’s CPU when not caching. With caching on, similarly to the read-only case, the client’s workload increases somewhat (but not as much), and the database server’s CPU usage decreases significantly. In the last row, the CPU utilization of ScaleArc shows that although it’s somewhat higher with caching, it’s still not that much higher.

From these tests it’s visible that caching can still be beneficial even if the write ratio is as high as in this test.

Conclusion
Engineering is always about making the right tradeoffs. If one wants features that needs a protocol-level load balancer like ScaleArc, the price should be paid in the overhead of Layer 7 parsing and decision making. ScaleArc’s engineering team did a great job minimizing this overhead. ScaleArc itself is very well tunable for different workload types (if caching is important, ScaleArc can be tuned for caching – if query rewriting, ScaleArc can be tuned for that).

The post ScaleArc: Benchmarking with sysbench appeared first on MySQL Performance Blog.