Showing posts with label Performance. Show all posts

Wednesday, October 27, 2010

Important Wcf performance issue + workaround

@YaronNaveh

I have written about Wcf performance issues before, but this one seems to be the biggest. Valery published an interesting performance issue in the Wcf forum. In short, a WCF client tries to consume a non-WCF service where the contract looks something like this:

class Foo
{
    byte[] picture;
}


In soap, byte arrays are encoded as base64 strings so it can look like this:

<picture>/9j/4AAQSkZJReV6R8MLi7nW6UUUViWf/Z.....</picture>

or with line breaks after each 73 characters, like this:

<picture>/9j/4AAQSkZJReV6R8MLi7nW61+58zBz5Q+7Xpdj
/PK/4AAQSkPOIeV6R8MLi7nW61+58zBz5Q+7Xpdj
/9R/4AAQSkZJReV6R8MLi7nW6VZ788zBz5Q+7Xpdj
4U4wVoqwUUUViWf/Z</picture>

Both options are valid according to the base64 RFC:

Implementations MUST NOT add line feeds to base-encoded data unless
the specification referring to this document explicitly directs base
encoders to add line feeds after a specific number of characters.

OK, so the RFC does not really advocate this... But it is a fact that many Soap stacks still use this MIME-originated format, and Wcf supports it as well.

So what is the problem?
It seems that when Wcf gets a message that contains base64 with CRLF, processing is slower by a few seconds(!). Drilling down shows that the problem is in the DataContract serializer. Take a look at this program:

using System;
using System.Diagnostics;
using System.IO;
using System.Runtime.Serialization;

[DataContract]
public class Foo
{
    [DataMember]
    public byte[] picture;
}

class Program
{
    static void Main(string[] args)
    {
        var t1 = getTime(@"C:\temp\base64_with_line_breaks.txt");
        var t2 = getTime(@"C:\temp\base64_without_line_breaks.txt");

        Console.WriteLine("Time with breaks: " + t1);
        Console.WriteLine("Time with no breaks: " + t2);

        Console.ReadKey();
    }

    // Deserializes the same file 40 times and returns the elapsed seconds.
    static double getTime(string path)
    {
        var ser = new DataContractSerializer(typeof (Foo));

        // Dispose the stream so the file handle is released.
        using (var stream = new FileStream(path, FileMode.Open))
        {
            var timer = Stopwatch.StartNew();

            for (int i = 0; i < 40; i++)
            {
                ser.ReadObject(stream);
                stream.Position = 0; // rewind for the next iteration
            }

            return timer.Elapsed.TotalSeconds;
        }
    }
}

For those of you who are interested in testing this, the files are here and here.

The output is:

Time with breaks: 10.8998196 seconds
Time with no breaks: 0.0029994 seconds

This clearly reveals a performance problem.

Why does this happen?

While debugging the .Net source code, I have found this in the XmlBaseReader class (code comments were in the source - they are not mine):


int ReadBytes(...)
{
    try
    {
        ...
    }
    catch (FormatException exception)
    {
        // Something was wrong with the format, see if we can strip the spaces
        int i = 0;
        int j = 0;
        while (true)
        {
            while (j < charCount && XmlConverter.IsWhitespace(chars[j]))
                j++;
            if (j == charCount)
                break;
            chars[i++] = chars[j++];
        }
        ...
    }
}

So the data contract serializer tries to read the base64 string, but for some reason succeeds only if the string has no whitespace inside it (we could debug further to see how that happens, but that is too exhausting for one post :). When the first attempt fails, a FormatException is thrown and caught, the serializer removes all the whitespace (which requires copying the buffer again) and tries again. This is definitely a performance issue.

Notes:

  • This happens with both .Net framework 3.5 and 4.0.

  • This is a DataContract-specific issue - it does not happen when you use other .Net mechanisms such as Convert.FromBase64String.

    I have reported this on Microsoft Connect; you are welcome to vote this issue up.

    Workarounds

    There are a few workarounds. The trade-offs are generally about convenience of API (or "where you prefer to put the 'dirty' stuff").

    1. As Valery noticed, you can change the contract to use String instead of byte[]. Then Convert.FromBase64String will give you the byte array.

    2. Change your contracts to use the XmlSerializer instead of the DataContract serializer. The former does not experience this issue. The XmlSerializer is generally slower (when base64 is not involved, that is), so this is what you lose. You get a better API here, as clients do not need to manipulate the base64 string.

    3. The best option, of course, is to change the service implementation to return base64 without line breaks. Also, if large binaries are returned anyway, it may be a better idea to employ MTOM.

    4. A Wcf custom encoder can strip the spaces from the message before it is deserialized. However, this also involves copying strings, so it is beneficial only in rare cases.
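
Workaround 1 could look roughly like the sketch below. This is only an illustration, not code from the forum thread: the class name FooAsString and the helper property PictureBytes are made up for this example, and the contract namespace is left at its default. It relies on the documented behavior of Convert.FromBase64String, which ignores white-space characters embedded in its input.

```csharp
using System;
using System.Runtime.Serialization;

// Hypothetical replacement for the Foo contract shown earlier.
[DataContract(Name = "Foo")]
public class FooAsString
{
    // The wire element is still <picture>, but it is read as plain text,
    // so the serializer never tries to decode the base64 itself.
    [DataMember(Name = "picture")]
    public string Picture;

    // Not a DataMember: decoded on demand by the caller.
    // Convert.FromBase64String skips tab, CR, LF and space characters.
    public byte[] PictureBytes
    {
        get { return Picture == null ? null : Convert.FromBase64String(Picture); }
    }
}

public static class Base64WorkaroundDemo
{
    public static void Main()
    {
        // Six bytes (1..6) encoded as base64, with a CRLF in the middle.
        var foo = new FooAsString { Picture = "AQID\r\nBAUG" };
        Console.WriteLine(foo.PictureBytes.Length);
    }
}
```

The cost of this approach is the one mentioned above: every consumer of the contract has to remember to go through the decoding property instead of reading a byte[] directly.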

    What's next? get this blog rss updates or register for mail updates!

    Sunday, July 25, 2010

    WCF Gotcha: Binary and MTOM encodings not optimal for untyped scenarios

    WCF binary encoding efficiently writes Xml in a binary optimized format. For example, this text message

    <s:Envelope …>
      <s:Header> 
        <!-- ws-addressing stuff… -->
       </s:Header> 
      <s:Body>
        <MyContract>
          <arr>
            <b:int>99999</b:int>
            <b:int>99999</b:int>
            <b:int>77777</b:int>
            <b:int>99999</b:int>
          </arr>
        </MyContract>
      </s:Body>
    </s:Envelope>

    is 1KB in the default text encoding but only about half that size in binary (554 bytes). There are various reasons:
    • Binary encoding writes each int value as one 4-byte int rather than as five 1-byte chars.
    • Binary does not repeat the “int” element name more than once.
    • Binary does not need to write known elements (e.g. Envelope, Body) in full, only their keys in an optimized dictionary.
    The difference is clear if we look at the text and binary messages in Fiddler’s HexView:

    [Images: Fiddler HexView of the text message and of the binary message]


    This code snippet is the one I used to send the above message in a binary format:

    [ServiceContract]
    public interface IUniversalContract
    {
        [OperationContract(Action = "*", ReplyAction = "*")]
        Message Send(Message message);
    }

    [DataContract]
    class MyContract
    {
        [DataMember]
        public int[] arr;
    }

    private static IUniversalContract GetChannel()
    {
        var binding = new CustomBinding();
        binding.Elements.Add(new BinaryMessageEncodingBindingElement());
        binding.Elements.Add(new HttpTransportBindingElement());
        var factory = new ChannelFactory<IUniversalContract>(binding);
        return factory.CreateChannel(new EndpointAddress("http://localhost:8888/"));
    }

    static void Main(string[] args)
    {
        var channel = GetChannel();
        var obj = new MyContract { arr = new[] { 99999, 99999, 77777, 99999 } };
        Message msg = Message.CreateMessage(MessageVersion.Soap12WSAddressing10, "someAction", obj);
        channel.Send(msg);
    }

    You can see I have built a custom binding which uses the binary encoding. Then I have instantiated my data contract class and sent it.

    But take a look at this code:

    static void Main(string[] args)
    {
        var channel = GetChannel(); //same method as above

        var str =
            new StringReader(
                @" <MyContract xmlns=""http://schemas.datacontract.org/2004/07/ConsoleApplication288""
    xmlns:i=""http://www.w3.org/2001/XMLSchema-instance""><arr xmlns:b=""http://schemas.microsoft.com/2003/10/Serialization/Arrays""><b:int>99999</b:int>
    <b:int>99999</b:int><b:int>77777</b:int>
    <b:int>99999</b:int></arr></MyContract>");

        XmlReader reader = XmlReader.Create(str);
        Message msg = Message.CreateMessage(MessageVersion.Soap12WSAddressing10, "someAction", reader);
        channel.Send(msg);
    }

    It sends the same message on the wire but uses a raw xml message instead of a live object. This is common in routing scenarios, in cases where the message is manually handcrafted and in places where message is taken from an external source.

    Let’s see what this message looks like in Fiddler:

    Its size is 577 bytes, which means it has 23 extra bytes over the object-based binary message. While this seems negligible, had we sent more array elements, or had the integer values required additional digits, the difference would be very noticeable, especially for an optimized format. For example, with just 50 items in the array the second message was already 2.5 times larger than the fully optimized one.

    So where does this difference come from? When analyzing the raw messages we find two interesting differences:
    • In the less optimized message the text “int” repeats a few times, whereas in the optimized one it appears only once. As far as I understand, this text represents the array element name and not the fact that the type is an int.
    • In the less optimized message the array values are written as five 1-byte chars (=5 bytes per value), whereas in the optimized message each is one 4-byte integer (=4 bytes per value). For higher integers the difference is bigger, and for long arrays the difference is huge.
    The second difference is understandable: the binary encoder has no type information since it only gets a big string as input, so it cannot optimize integers. I’m not really sure about the first one, however. Maybe the reason is that for integers we do not need an array item separator, as the values are of a fixed size, whereas for strings we need to know where each string ends. Still, I would expect a more optimized separator, so I’m not really sure about this one.
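
The typed versus untyped difference can be reproduced outside of WCF messaging with the binary XML writer directly. The sketch below is an approximation under stated assumptions, not what the WCF encoder does internally for a whole message: BinaryXmlSizes and its two methods are made-up names, the elements carry no namespaces, and no dictionary or Soap envelope is involved. It only contrasts handing the writer an int[] with handing it the same values as text. On .Net framework it needs a reference to System.Runtime.Serialization.

```csharp
using System;
using System.IO;
using System.Xml;

public static class BinaryXmlSizes
{
    // Typed path: the writer receives an int[] and can emit it as one array.
    public static long TypedSize(int[] values)
    {
        using (var ms = new MemoryStream())
        using (var w = XmlDictionaryWriter.CreateBinaryWriter(ms))
        {
            w.WriteStartElement("arr");
            w.WriteArray(null, "int", null, values, 0, values.Length);
            w.WriteEndElement();
            w.Flush();
            return ms.Length; // read before the writer closes the stream
        }
    }

    // Untyped path: every value reaches the writer as a string,
    // the way it does when the body comes from raw XML.
    public static long UntypedSize(int[] values)
    {
        using (var ms = new MemoryStream())
        using (var w = XmlDictionaryWriter.CreateBinaryWriter(ms))
        {
            w.WriteStartElement("arr");
            foreach (int v in values)
            {
                w.WriteStartElement("int");
                w.WriteString(XmlConvert.ToString(v));
                w.WriteEndElement();
            }
            w.WriteEndElement();
            w.Flush();
            return ms.Length;
        }
    }

    public static void Main()
    {
        var values = new[] { 99999, 99999, 77777, 99999 };
        Console.WriteLine("typed:   " + TypedSize(values) + " bytes");
        Console.WriteLine("untyped: " + UntypedSize(values) + " bytes");
    }
}
```

Increasing the array length in Main is a quick way to see how the gap between the two paths grows with the number of items.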


    Conclusion

    Wcf binary encoding is a popular optimization technique, especially since it became available to Silverlight applications. However, when untyped messages are used, binary encoded messages are less optimal than in the typed scenarios. They are still more compact than regular text encoded messages. The same conclusion applies to MTOM scenarios, as they use the same optimization technique. If you use such messages (usually in routing scenarios) you should be aware of this and consider the different trade-offs.


    Sunday, April 4, 2010

    Wcf first connection is slow

    You may have experienced the first connection from a Wcf client to a service being extremely slow. In some cases this is expected, as the first connection establishes a session, which may require an exchange of multiple messages. However, in some cases there is no obvious reason for it.

    One cause may be a slow http proxy. By default, Wcf uses the default http proxy configured in Internet Explorer. The proxy may be specified directly or in the form of a configuration script.

    If this is the cause of the slowness, you can reproduce it by surfing to any site using a new IE instance - the same slowness should occur.

    The possible solutions are:

    1. Remove the proxy configuration from IE.
    2. Configure your WCF binding with UseDefaultWebProxy=false.
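
In configuration, the second option might look like the fragment below. The binding name "NoProxyBinding" is a placeholder, and basicHttpBinding is only an example; the same useDefaultWebProxy attribute is available on the other http-based standard bindings such as wsHttpBinding.

```xml
<system.serviceModel>
  <bindings>
    <basicHttpBinding>
      <!-- "NoProxyBinding" is a placeholder name; reference it from your client endpoint -->
      <binding name="NoProxyBinding" useDefaultWebProxy="false" />
    </basicHttpBinding>
  </bindings>
</system.serviceModel>
```

Note that this bypasses the system proxy entirely, so it only fits environments where the service is reachable without one.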


    Friday, March 5, 2010

    DataSets Considered Harmful (part 2): Performance

    The first part in this series discussed various interoperability issues with DataSets. This time I want to discuss the performance implications.

    Let us send the simplest DataSet to some server:

    var ds = new System.Data.DataSet();
    DataTable t = new DataTable();
    t.Columns.Add("col1");
    t.Columns.Add("col2");
    t.Rows.Add("row1col1", "row1col1");
    t.Rows.Add("row2col1", "row2col1");
    ds.Tables.Add(t);
    localhost.Service s = new Service();
    s.GetData(ds);

    Here is the generated Soap:

    <soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
        <soap:Body>
            <GetData xmlns="http://tempuri.org/">
                <value>
                    <xs:schema id="NewDataSet" xmlns="" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:msdata="urn:schemas-microsoft-com:xml-msdata">
                        <xs:element name="NewDataSet" msdata:IsDataSet="true" msdata:UseCurrentLocale="true">
                            <xs:complexType>
                                <xs:choice minOccurs="0" maxOccurs="unbounded">
                                    <xs:element name="Table1">
                                        <xs:complexType>
                                            <xs:sequence>
                                                <xs:element name="col1" type="xs:string" minOccurs="0" />
                                                <xs:element name="col2" type="xs:string" minOccurs="0" />
                                            </xs:sequence>
                                        </xs:complexType>
                                    </xs:element>
                                </xs:choice>
                            </xs:complexType>
                        </xs:element>
                    </xs:schema>
                    <diffgr:diffgram xmlns:msdata="urn:schemas-microsoft-com:xml-msdata" xmlns:diffgr="urn:schemas-microsoft-com:xml-diffgram-v1">
                        <NewDataSet xmlns="">
                            <Table1 diffgr:id="Table11" msdata:rowOrder="0" diffgr:hasChanges="inserted">
                                <col1>row1col1</col1>
                                <col2>row1col1</col2>
                            </Table1>
                            <Table1 diffgr:id="Table12" msdata:rowOrder="1" diffgr:hasChanges="inserted">
                                <col1>row2col1</col1>
                                <col2>row2col1</col2>
                            </Table1>
                        </NewDataSet>
                    </diffgr:diffgram>
                </value>
            </GetData>
        </soap:Body>
    </soap:Envelope>

    This Soap commits two “crimes”. The first one is the embedded schema. Since the wsdl is not typed, .Net must serialize the schema inside the Soap. This can have serious performance implications with large schemas.

    The second issue is the rowOrder and hasChanges attributes on each data row. In most cases you just want to pass a snapshot of the data to a client, so these are simply not required.

    The overall result is a bloated Soap without any meaningful benefit.
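
To get a feel for the overhead, the same two-row DataSet can be serialized locally in both shapes. This is a rough sketch: DataSetOverhead and its methods are names invented for the example, and the lengths it reports are for the standalone XML strings, not for the exact Soap envelope shown above.

```csharp
using System;
using System.Data;
using System.IO;

public static class DataSetOverhead
{
    // Same DataSet as in the post: one table, two string columns, two rows.
    public static DataSet BuildSample()
    {
        var ds = new DataSet();
        var t = new DataTable();
        t.Columns.Add("col1");
        t.Columns.Add("col2");
        t.Rows.Add("row1col1", "row1col1");
        t.Rows.Add("row2col1", "row2col1");
        ds.Tables.Add(t);
        return ds;
    }

    // Rows only: no schema and no diffgram attributes.
    public static int DataOnlyLength(DataSet ds)
    {
        return ds.GetXml().Length;
    }

    // Approximates the web service payload: inline schema plus a diffgram.
    public static int SchemaPlusDiffGramLength(DataSet ds)
    {
        var sw = new StringWriter();
        ds.WriteXml(sw, XmlWriteMode.DiffGram);
        return ds.GetXmlSchema().Length + sw.ToString().Length;
    }

    public static void Main()
    {
        var ds = BuildSample();
        Console.WriteLine("data only:         " + DataOnlyLength(ds) + " chars");
        Console.WriteLine("schema + diffgram: " + SchemaPlusDiffGramLength(ds) + " chars");
    }
}
```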


    Saturday, June 20, 2009

    Are WCF defaults considered harmful?

    When we programmers see such an error message


    quota 65536 too small please increase


    or something along these lines, and have no idea what this quota is good for, we face the temptation to put in a ridiculously large number (like 6.10* 8^23) so we would never have to see it again. We should restrain ourselves from doing this and put in a reasonable number based on our needs. See the story below.

    Ayende published an interesting post on a case where he needed to send a large number of objects between a WCF client and server. For this he had to alter some server-side default:


    [ServiceBehavior(MaxItemsInObjectGraph = Int32.MaxValue)]


    When he ran "update service reference" on his client, he found out that this setting is not propagated to clients, which forced him to change it manually in each and every client (as stated in MSDN).
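
On the client, the manual change is typically an endpoint behavior in the configuration file. The fragment below is a sketch: the behavior name, address and contract name are placeholders, and 200000 is an arbitrary example value rather than a recommendation.

```xml
<system.serviceModel>
  <behaviors>
    <endpointBehaviors>
      <!-- "LargeGraphBehavior" is a placeholder name -->
      <behavior name="LargeGraphBehavior">
        <!-- pick a bound that matches the largest graph you expect -->
        <dataContractSerializer maxItemsInObjectGraph="200000" />
      </behavior>
    </endpointBehaviors>
  </behaviors>
  <client>
    <!-- address and contract are placeholders -->
    <endpoint address="http://localhost:8888/"
              binding="wsHttpBinding"
              contract="IMyService"
              behaviorConfiguration="LargeGraphBehavior" />
  </client>
</system.serviceModel>
```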

    Arnon follows this up in his post and claims the following:

  • This setting needs to be automatically propagated to clients
  • There are other settings which are not propagated and need to be, for example message size limits
  • The default setting should be higher (although not infinite)

    I absolutely agree with the first claim. This setting is in effect both when sending and when receiving data. Since in each call one party sends and the other receives, this setting has to be correlated between the parties. The way to dispatch this setting to clients would probably be by extending the wsdl's WS-Policy with this new setting (which would be Microsoft proprietary for that matter).

    I only partially agree with Arnon's second statement. The MaxReceivedMessageSize setting (if it's the one he refers to) only affects the receiving side. There is no limit on the size of outgoing messages. Here it makes sense to have a different value for the client and the server, since they probably have different hardware capabilities and also need to handle different data.

    Going back to the opening paragraph, I want to make clear the rationale behind all of these settings (and Arnon and I are probably in agreement on this). These settings are not meant to directly improve the performance of the service; rather, they aim to block DoS attacks. If the limit is too high, an attacker can send an XML bomb which will consume large server resources. These settings are much more important for the server than for the client, but as long as clients are allowed to customize them, the client values do not always have to be correlated with the server ones.
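
As an illustration of choosing a bounded value instead of the maximum, a receiving side that expects messages of up to about 1 MB could be configured as follows. The binding name and the 1048576 figure are assumptions made for this example; the default for maxReceivedMessageSize is 65536 bytes.

```xml
<system.serviceModel>
  <bindings>
    <wsHttpBinding>
      <!-- "BoundedBinding" is a placeholder name; 1048576 bytes = 1 MB -->
      <binding name="BoundedBinding" maxReceivedMessageSize="1048576">
        <readerQuotas maxArrayLength="1048576"
                      maxStringContentLength="1048576" />
      </binding>
    </wsHttpBinding>
  </bindings>
</system.serviceModel>
```

Since the setting only affects incoming messages, the client and the server can each pick a value that fits what they actually receive.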

    Saturday, April 18, 2009

    WCF Performance: Making your service run 3 times faster

    A lot of people use WCF default settings in production. In many cases changing these defaults can boost the service throughput dramatically.

    Let's look at the following use case:


  • WsHttpBinding is used

  • Message level security is used: X.509 certificate or windows authentication, where client can also use a username/password or be anonymous

  • (Optional assumption) A typical client makes one service call and then disconnects



    The WsHttpBinding implicit defaults can be written explicitly as below:


    <wsHttpBinding>
      <binding name="BadPerformanceBinding">
       <security mode="Message">
        <message clientCredentialType="..."
         negotiateServiceCredential="true"
         establishSecurityContext="true" />
       </security>
      </binding>
    </wsHttpBinding>


    Let's simulate a load on this service by employing many virtual users, each of whom repeatedly calls the service once and immediately disconnects. The number of users should be large enough that the service runs at its maximum capacity. The results are:


    Transactions per second: 15.235
    Average time of a transaction: 1.4 seconds


    Note: I didn't use a super strong server here but as we can see below it shouldn't matter for our needs. Also a load of just a few minutes was enough to prove our theory.






    Those are not very good results of course.

    Now let's tweak the configuration a little bit:


    <wsHttpBinding>
      <binding name="BetterPerformanceBinding">
       <security mode="Message">
        <message clientCredentialType="..."
         negotiateServiceCredential="false"
         establishSecurityContext="false" />
       </security>
      </binding>
    </wsHttpBinding>


    And with the same number of virtual users we get these results:


    Transactions per second: 51.833
    Average time of a transaction: 0.384 seconds


    That's 3.5 times faster!





    So, what happened here?
    Since the only change we made is in two settings, we need to analyze each of them.

    negotiateServiceCredential
    This setting determines whether the clients can get the service credential (e.g. certificate) using negotiation with the service. The credentials are used in order to authenticate the service and to protect (encrypt) the messages. When this setting is set to "true" a bunch of infrastructure soap envelopes are sent on the wire before the client sends its request. When set to "false" the client needs to have the service credentials out of band.

    The trade-off here is better performance (using "false") versus more convenience (using "true"). Setting "false" has its hassles, as we now need to propagate the service credential to clients. However, performance-wise, setting "negotiateServiceCredential" to "false" is always better.
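
One common way to give the client the service credential out of band is to install the service certificate on the client machine and reference it from an endpoint behavior. The fragment below is a sketch of that approach; the behavior name, the subject name "MyServiceCert" and the store choice are placeholders for whatever your deployment uses.

```xml
<system.serviceModel>
  <behaviors>
    <endpointBehaviors>
      <!-- "ServiceCertBehavior" is a placeholder name -->
      <behavior name="ServiceCertBehavior">
        <clientCredentials>
          <serviceCertificate>
            <!-- "MyServiceCert" is a hypothetical subject name -->
            <defaultCertificate findValue="MyServiceCert"
                                storeLocation="CurrentUser"
                                storeName="TrustedPeople"
                                x509FindType="FindBySubjectName" />
          </serviceCertificate>
        </clientCredentials>
      </behavior>
    </endpointBehaviors>
  </behaviors>
</system.serviceModel>
```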

    Take a look at how many infrastructure messages are exchanged when negotiateServiceCredential is "true":



    While when not negotiating life is much brighter:



    establishSecurityContext
    This setting determines whether WS-SecureConversation sessions are established between the client and the server. So what is a secure conversation anyway? In a very simplified manner, we can say that a normal secured web service request requires one asymmetric encryption. Accordingly, N normal requests require N asymmetric encryptions. Since asymmetric encryption is very slow, setting up a secure conversation is usually a good practice: it requires a one-time asymmetric encrypted message exchange in order to set up a session; further calls in the session use symmetric encryption, which is much faster.

    Now remember that in our case we assume that clients call the service just one time and disconnect. If a secure session is established the message exchange will look like this:


    Message 1: Setting up a secure session (asymmetric encryption)
    Message 2: The actual request (symmetric encryption)


    If we do not use secure session we have:


    Message 1: The actual request (asymmetric encryption)


    So it is clear that we're better off in the latter case.

    With secure sessions there isn't really any trade-off and the decision is quite scientific: when only one request per client is expected, set establishSecurityContext to "false".
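
The same reasoning can be written as a toy cost model. This is only a sketch of the simplified picture above: SecureSessionModel is a made-up name, asymCost and symCost are hypothetical cost units rather than measurements, and the model ignores the extra network round trip that setting up a session adds.

```csharp
using System;

public static class SecureSessionModel
{
    // With a session: one asymmetric set-up exchange,
    // then one symmetric operation per call.
    public static double WithSession(int calls, double asymCost, double symCost)
    {
        return asymCost + calls * symCost;
    }

    // Without a session: every call pays the asymmetric cost.
    public static double WithoutSession(int calls, double asymCost)
    {
        return calls * asymCost;
    }

    public static void Main()
    {
        // Hypothetical units: asymmetric = 100, symmetric = 1.
        foreach (int calls in new[] { 1, 2, 10 })
        {
            Console.WriteLine(calls + " call(s): with session = "
                + WithSession(calls, 100, 1)
                + ", without session = " + WithoutSession(calls, 100));
        }
    }
}
```

In this model a single call is never cheaper with a session, while from the second call onward the session wins as long as a symmetric operation is cheaper than an asymmetric one.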

    Summary
    Wisely changing WCF defaults can yield a significant improvement in your service performance. The exact changes that need to be made, and their exact effect, depend on the scenario. The example above showed how to make a certain service about three times faster.


    Wednesday, October 15, 2008

    Cryptic WCF error messages (part 5 of N)

    This one isn't that cryptic actually but its cause is not always clear:


    System.TimeoutException:

    The request channel timed out while waiting for a reply after 00:01:00. Increase the timeout value passed to the call to Request or increase the SendTimeout value on the Binding. The time allotted to this operation may have been a portion of a longer timeout.


    There can be various reasons why a proxy would throw such an exception, the main one being that the server is not available. However, another reason is that the maximum number of allowed sessions was reached. Some bindings (e.g. wsHttpBinding) under some configurations (e.g. Security or ReliableMessaging) cause the server to open a session with each unique client that accesses the service. A session is closed after the client explicitly closes the proxy or when the client is inactive for some time. There can only be a limited number of open sessions. When this limit is reached, clients get the above timeout exception. If you want to raise this limit, you need to increase the number of allowed sessions:


    <behaviors>
       <serviceBehaviors>
         <behavior name="ServiceBehavior">
           <serviceMetadata httpGetEnabled="true" />
           <serviceDebug includeExceptionDetailInFaults="false" />
           <serviceThrottling maxConcurrentSessions="90" />
         </behavior>
       </serviceBehaviors>
    </behaviors>


    Or, from the configuration editor, open the behavior (#1 in the image below), add an element (#2) and choose the "serviceThrottling" element (#3). Then change the number of sessions (second image below).







    You shouldn't allow too many concurrent sessions, as it can hurt service performance. The most important thing to remember is to explicitly close your proxy after you have finished using it:


    client.Close();
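
Close() can itself throw, for example when the channel is already faulted, and in that case the session is only released once Abort() is called. A minimal sketch of a close-or-abort helper follows. ProxyCloser is a hypothetical name, and the helper takes delegates so that it compiles without any WCF reference; it assumes the proxy exposes parameterless Close() and Abort() methods, as generated WCF clients do. Unlike variants that swallow communication errors, this one always rethrows after aborting.

```csharp
using System;

public static class ProxyCloser
{
    // Tries a graceful close; if that fails, aborts so the
    // session is released, then rethrows the original exception.
    public static void CloseOrAbort(Action close, Action abort)
    {
        try
        {
            close();
        }
        catch
        {
            abort();
            throw;
        }
    }
}
```

With a generated proxy the call would be ProxyCloser.CloseOrAbort(client.Close, client.Abort); typically placed in a finally block around the service calls.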
