Archive for February, 2010

LTE & 3G False Alarms

Thursday, February 25th, 2010

Capacity and next-generation mobile services (3G & 4G/LTE) seem to be constantly under scrutiny. Ever since the iPhone came on the scene and sucked the lifeblood out of at&t’s backhaul network, we constantly hear about the impending doom, the bandwidth desert we’re all facing ahead. This has been labeled “The Capacity Crisis”, and a gazillion articles harp on the uncertainty of our mobile broadband future. Sound a bit like the swine flu? Whatever happened to that?

One thing you learn working with real operators doing real deployments is that:

  1. backhaul capacity is something they are dealing with (don’t lose too much sleep);
  2. there are bigger issues: real deployment challenges to figure out first.

And field trials for 3G & 4G are full of such examples. No one’s finding an issue getting bandwidth to the cell site, and no magic formula is required for that: if a fiber is laid or a good microwave connection is set up, the capacity is there, pretty much on tap. The issues that operators are stumbling over have more to do with the operational nuts and bolts. A lot of new technologies are getting put through their paces at the same time, and some that work great in the lab seem to be falling short in the field.

Ethernet OAM: Lies, Lies & More Lies

One of the key technologies almost every operator is counting on is Y.1731, the popular Ethernet operations, administration and maintenance (OAM) standard for connectivity fault management (CFM) and performance monitoring (PM). Y.1731 is a must, and for good reason: it’s the only standards-based QoS monitoring method available to assure Ethernet latency, jitter, frame loss and availability meet the demanding targets required for packet backhaul. It works in multi-vendor networks; it works in multi-operator networks (great for using and keeping tabs on wholesale backhaul carriers). Every network element maker selling into backhaul has it in their products and they’re all tuned up and ready to go. Are they?

A recent field trial in a 3G deployment in North America went into crisis mode when one leading mobile operator turned on OAM PM to verify latency over their backhaul provider’s network.  The one-way latency target (and SLA) from mobile switching center (MSC) to tower was set at 5ms.  Y.1731 measured 20ms.  The mobile operator freaked.  The backhaul carrier claimed 3ms.  What was up?

Using an alternative test method transparent to OAM processing, the mobile operator confirmed the 3ms, giving both carriers another problem to solve: why were the OAM measurements in error by more than 500%? The first step was to turn off OAM at all intermediate nodes in the network, and suddenly Y.1731 PM measurements said 3ms. They turned it back on: 20ms. It’s important to point out here that the delay only affected OAM traffic; real traffic was unaffected and was meeting spec the whole time! With the problem isolated to OAM processing itself, they were starting to experience something most network element vendors knew full well might turn up, but were hoping would go unnoticed.
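To make the arithmetic concrete, here is a minimal Python sketch using the three figures from the trial (5ms target, 20ms reported by Y.1731, 3ms from the independent test). The function names and the alarm labels are illustrative assumptions, not anything the operators actually ran.

```python
def oam_error_percent(oam_ms: float, reference_ms: float) -> float:
    """Relative error of the OAM reading against an independent reference."""
    return (oam_ms - reference_ms) / reference_ms * 100.0


def classify_alarm(oam_ms: float, reference_ms: float, sla_ms: float) -> str:
    """Compare what OAM reports with what the reference test shows."""
    oam_breach = oam_ms > sla_ms
    real_breach = reference_ms > sla_ms
    if oam_breach and not real_breach:
        return "false alarm"      # OAM says breach, the network is fine
    if real_breach and not oam_breach:
        return "missed breach"    # the network is out of spec, OAM is silent
    if real_breach:
        return "real breach"
    return "ok"


SLA_MS = 5.0        # one-way latency target from the trial
OAM_MS = 20.0       # what Y.1731 PM reported
REFERENCE_MS = 3.0  # what the alternative test method measured

print(f"OAM error: {oam_error_percent(OAM_MS, REFERENCE_MS):.0f}%")
print(classify_alarm(OAM_MS, REFERENCE_MS, SLA_MS))
```

The point of the second function is that an alarm is only useful if it tracks the real traffic; here it tracks the measurement path instead.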

[Figure: OAM delays]

The problem? Most switches and routers claim to offer the full Y.1731 feature set, but none of this was thought out when the products were originally architected. When Y.1731 became a must-have for backhaul, the features were typically shoe-horned into a software patch. Running delay-sensitive monitoring features in software is a big faux pas, because shared CPU time in the network element is a poor place to do anything critical. These CPUs are busy doing more important things (like routing / switching functions) most of the time, putting OAM into background processing queues. When traffic is at its peak, the network elements are heavily taxed, and just when you need performance measurements the most, they turn out to be the least accurate of all.
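A toy model shows why the error grows with load. The sketch below assumes each software-OAM hop behaves like a simple M/M/1 queue, so the average wait is service time times load / (1 - load). The hop count, the 0.5ms service time and the load levels are hypothetical values picked for illustration; none of them come from the trial.

```python
def software_oam_wait_ms(service_ms: float, cpu_load: float) -> float:
    """Mean queueing wait for an OAM frame at one hop (M/M/1 assumption)."""
    if not 0.0 <= cpu_load < 1.0:
        raise ValueError("cpu_load must be in [0, 1)")
    return service_ms * cpu_load / (1.0 - cpu_load)


def reported_latency_ms(true_ms: float, hops: int,
                        service_ms: float, cpu_load: float) -> float:
    """Latency an OAM probe would report: real path delay plus CPU waits."""
    return true_ms + hops * software_oam_wait_ms(service_ms, cpu_load)


# Hypothetical path: 3 ms real latency, 4 intermediate nodes doing OAM in software.
for load in (0.1, 0.5, 0.9):
    reported = reported_latency_ms(true_ms=3.0, hops=4, service_ms=0.5, cpu_load=load)
    print(f"CPU load {load:.0%}: OAM reports {reported:.1f} ms")
```

Real traffic never sees this queue because it is forwarded in hardware, which is consistent with the data path meeting spec while the OAM reading did not.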

[Figure: OAM delays, continued]

Scary stuff.  In this case, every latency alarm the operators saw wasn’t an indication of network performance issues, but of CPU processing restrictions.  Not a very useful alert.

There are of course ways to fix this situation, and these two operators came to their own conclusions and had things humming a little while later. OAM can certainly work in large-scale, multi-provider deployments, and can assure critical services. It just takes a few tricks and some solid, hardware-based OAM devices to help things out.

[Figure: Y.1731 flows]

This gets especially critical when you consider the OAM flows hitting the MSC: expect thousands at a time as CFM and PM for 3 service classes from, say, 250 towers converge at a single router.
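The flow count is simple multiplication, sketched here in Python. The towers and service classes are the figures quoted above; treating CFM and PM as one session each per class, and the one-frame-per-second interval, are assumptions made only to put a number on the load.

```python
def oam_sessions(towers: int, service_classes: int, functions_per_class: int = 2) -> int:
    """Sessions terminating at the MSC router (CFM + PM = 2 functions per class)."""
    return towers * service_classes * functions_per_class


def frames_per_second(sessions: int, interval_s: float) -> float:
    """Aggregate OAM frame rate if every session sends one frame per interval."""
    return sessions / interval_s


sessions = oam_sessions(towers=250, service_classes=3)
print(sessions)                               # 1500
print(frames_per_second(sessions, 1.0))       # assumed 1 s interval
print(frames_per_second(sessions, 0.1))       # assumed 100 ms interval
```

Shorten the interval by a factor of ten and the router has ten times as many delay-sensitive frames to handle, which is exactly where software OAM falls over.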

We’ve been getting a lot of calls in the middle of the night recently, and things can always be worked out.  Let’s just say none of these calls are about ‘The Capacity Crisis’.  That’s for the media to worry about.



Wishes from the MWC

Wednesday, February 24th, 2010


The Mobile World Congress was full of ideas about apps, lifestyle, mobile workflows and processes that can be enhanced by mobility.  I was stunned and amazed at how little promotion there was about Ethernet and Mobile Backhaul.

In 2009 the lack of adequate Mobile Backhaul implementations was all over the press in the US and UK. One would think that the participants at the MWC would be promoting their latest mobile broadband solutions with fanfare. It seems that the elephant in the room will remain uncomfortably ignored.

This should be (is) the paramount issue in Mobile Networking today and equipment vendors should be investing resources to come up with solutions.  All of the entertainment, applications and business solutions promoting data usage are useless if the bandwidth end-to-end is not there to support them.

Fortunately I was in a position to engage multiple Mobile Carriers and Telecommunications Vendors at the MWC. Obviously the paramount concern around bandwidth has been answered with Ethernet in the backhaul, but the devil is in the details. Throwing bandwidth at the problem has proven time and time again to be only a band-aid.

From the discussions I had with operators and vendors there were a few common themes and/or concerns.  Here they are in summary.

Service Visibility

The single most important point Mobile Carriers are concerned about today is ensuring visibility into the service layers of their network. In other words, they want to measure, monitor, trend and manage their networks as accurately as possible. Currently TDM/ATM services are used for backhauling and there are tools inherent to these services which provide a level of visibility and comfort to Carriers. With the move to Ethernet in the backhaul, mobile carriers are loath to give up these capabilities.

In many cases these carriers do not own their own wireline backhaul infrastructure. These carriers could just trust their leased infrastructure to carry the traffic according to the SLAs they have contracted; however, we know in this day of “elastic bandwidth” via IP/MPLS that SLAs are easily compromised.

I can summarize this with a quote from one carrier I spoke with at the show who said,

“I ordered a 20 Mbit Ethernet service with a maximum latency of 10 milliseconds in October 2009 all under SLA. I measured it last week and I was getting 7 Mbit with a latency of over 40 milliseconds. What I thought was a dedicated service was actually an MPLS-based service with no guarantees, so we invoked the SLA penalties.”
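The check this carrier did by hand is easy to automate once the measurements exist. Below is a minimal Python sketch using the contracted and measured figures from the quote; the `Sla` class and the `sla_violations` function are hypothetical names, and 40 ms stands in for "over 40 milliseconds".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sla:
    min_throughput_mbps: float
    max_latency_ms: float


def sla_violations(sla: Sla, throughput_mbps: float, latency_ms: float) -> list[str]:
    """Return a human-readable line for every contracted value that is missed."""
    problems = []
    if throughput_mbps < sla.min_throughput_mbps:
        problems.append(
            f"throughput {throughput_mbps} Mbit/s below contracted {sla.min_throughput_mbps}"
        )
    if latency_ms > sla.max_latency_ms:
        problems.append(
            f"latency {latency_ms} ms above contracted {sla.max_latency_ms}"
        )
    return problems


contract = Sla(min_throughput_mbps=20.0, max_latency_ms=10.0)
for line in sla_violations(contract, throughput_mbps=7.0, latency_ms=40.0):
    print(line)
```

An empty list means the service is within contract; anything else is evidence for the penalty clause.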

Active Service Monitoring

Standards-based Ethernet OAM is the answer most vendors and operators have for ensuring active, continuous monitoring of the network. These technologies can be implemented in such a way that the operator would know of any issues in the network, most likely before a customer notices any problems. Such an implementation is necessary when actively monitoring a mobile network.


However, the devil is in the details. Ethernet-based OAM requires a lot of processing horsepower when trying to actively monitor thousands of base stations simultaneously. Existing routers and switches in the network aren’t (yet) architected for such a function. A number of carriers have proven, with catastrophic results, that their routers and switches aren’t able to handle such a load.

Furthermore, and probably most importantly, Ethernet OAM only measures Ethernet. All of the services running over these phones are either IP-based today or will be IP-based in the future, even voice. Carriers I have spoken with have expressed a wish to monitor the IP and IP service layers as much as, if not more than, the transport layer.

Performance Management

One-way measurements are at the top of the list of carriers’ wishes for service monitoring over Ethernet-based networks. While measurements of latency, jitter and delay are all necessary, Ethernet currently only measures round-trip performance. This limitation is one of the main reasons why mobile backhaul has taken so long to accept Ethernet as a transport technology.

In order to provide one-way measurements there needs to be some way of handling Ethernet in a synchronous-like fashion.  Since Ethernet, by definition, is an asynchronous technology, this is no easy feat.
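The reason round-trip is the easy case can be seen from the timestamps alone. In the Python sketch below, a probe leaves at t1 (local clock), is received at t2 and returned at t3 (both remote clock), and arrives back at t4 (local clock). The path delays, the responder turnaround and the 50 ms clock offset are made-up values in microseconds, chosen only to show the effect.

```python
def one_way_delay(tx_local: int, rx_remote: int) -> int:
    """Naive one-way delay: mixes two clocks, so it includes their offset."""
    return rx_remote - tx_local


def two_way_delay(t1: int, t2: int, t3: int, t4: int) -> int:
    """Round-trip delay minus responder turnaround; the clock offset cancels."""
    return (t4 - t1) - (t3 - t2)


# Hypothetical path, all values in microseconds.
FORWARD_US, REVERSE_US = 3_000, 4_000   # true one-way delays
TURNAROUND_US = 1_000                   # time the responder holds the frame
OFFSET_US = 50_000                      # remote clock runs 50 ms ahead

t1 = 1_000_000                              # sent, local clock
t2 = t1 + FORWARD_US + OFFSET_US            # received, remote clock
t3 = t2 + TURNAROUND_US                     # returned, remote clock
t4 = t1 + FORWARD_US + TURNAROUND_US + REVERSE_US  # back home, local clock

print(one_way_delay(t1, t2))          # 53000: true 3000 plus the 50000 offset
print(two_way_delay(t1, t2, t3, t4))  # 7000: forward + reverse, offset gone
```

The round-trip figure is right without any synchronization, but it cannot say how the 7 ms splits between directions. The one-way figure is only as good as the agreement between the two clocks.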

Standards bodies are reviewing all of the potential methods to make synchronous-like behaviors possible in Ethernet (1588v2, SyncE) but we’re at least a year away from a real solution. That doesn’t address the fact that the millions of installed Ethernet ports in carriers’ networks worldwide lack the hardware to take advantage of any new technological advancements. I am not so sure that 1588v2 or Synchronous Ethernet, in their current designs, will ever see mass adoption.


Another equally important aspect of performance management is to use the measurements for trending purposes. Many issues in a network aren’t visible right away but rather creep up over time. The ability to save measurements over time and then apply policies and rulesets for interpretation can yield another perspective on the health of the network. This is especially important when the transport of the backhaul network is heterogeneous, to highlight any long-term degradations that can be masked by digital services.
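As a rough illustration of trending, the Python sketch below fits a least-squares line through stored latency samples and estimates how long until a limit is crossed. The daily samples and the 5 ms limit are invented for the example, and a straight-line fit is only one of many possible rules.

```python
from typing import Optional, Sequence


def trend_slope(samples: Sequence[float]) -> float:
    """Least-squares slope of evenly spaced samples (units per sample period)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    num = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(samples))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


def periods_until_breach(samples: Sequence[float], limit: float) -> Optional[float]:
    """Periods until the trend line reaches the limit; None if not degrading."""
    slope = trend_slope(samples)
    if slope <= 0:
        return None
    return max(0.0, (limit - samples[-1]) / slope)


# Hypothetical daily average latency in ms: still in spec, but creeping up.
daily_latency_ms = [3.0, 3.1, 3.2, 3.3, 3.4]
print(f"slope: {trend_slope(daily_latency_ms):.2f} ms/day")
print(f"days until 5 ms: {periods_until_breach(daily_latency_ms, 5.0):.0f}")
```

Every sample here passes a simple threshold check, which is the point: the problem only shows up when the history is kept and interpreted.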

Traffic Conditioning

When a service traverses multiple transport/physical technologies a number of issues can (and will) arise.  These issues can manifest themselves as increased latency, jitter or re-transmission in the case of IP/data services.  These issues cause degradation of service, inefficient bandwidth usage and frame-loss.  One should also remember that there are differences between Fast Ethernet and Gigabit Ethernet that will give headaches if not allowed for.

These issues aren’t exclusive to media or protocol conversion but can also be traced to inadequately engineered hardware. Some base station vendors (who shall remain nameless) might be good at building radios but provide less than acceptable Ethernet interfaces for backhaul. Then we must take into account that many mobile carriers around the world have many types and brands of base stations in operation.

Carriers are now asking for backhaul solutions that can condition the traffic between the wireless and wireline networks (i.e. backhaul) to ensure that the aforementioned artifacts (jitter, delay, disruption) are minimized, if not eliminated altogether. Conditioning of traffic cannot be limited to the transport or Ethernet layer. The IP and IP service layers must also be conditioned to transit the network as efficiently as possible.
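The post does not say which conditioning mechanisms carriers have in mind. One common building block is a token bucket, which decides whether a frame fits within a contracted rate and burst size. The Python sketch below is a generic illustration of that technique with made-up rate and burst values, not a description of any vendor's product.

```python
class TokenBucket:
    """Token bucket conformance check: tokens are bytes, refilled at the rate."""

    def __init__(self, rate_bps: float, burst_bytes: float) -> None:
        self.rate_bytes_per_s = rate_bps / 8
        self.capacity = burst_bytes
        self.tokens = burst_bytes   # start with a full bucket
        self.last_s = 0.0

    def allow(self, now_s: float, frame_bytes: int) -> bool:
        """True if the frame conforms at time now_s (seconds, non-decreasing)."""
        elapsed = now_s - self.last_s
        self.last_s = now_s
        self.tokens = min(self.capacity,
                          self.tokens + elapsed * self.rate_bytes_per_s)
        if frame_bytes <= self.tokens:
            self.tokens -= frame_bytes
            return True
        return False


# Hypothetical profile: 8 kbit/s (1000 bytes/s) with a 1500-byte burst.
bucket = TokenBucket(rate_bps=8_000, burst_bytes=1_500)
print(bucket.allow(0.0, 1_500))   # True: the burst allowance covers it
print(bucket.allow(0.0, 1_500))   # False: back-to-back frame exceeds the profile
print(bucket.allow(1.5, 1_500))   # True: 1.5 s of refill restores 1500 bytes
```

A policer would drop the non-conforming frame, while a shaper would queue it until enough tokens accumulate, trading loss for delay. Which trade is right depends on the service class.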

In Conclusion

The MWC was a fantastic opportunity to talk with a number of carriers and vendors who all know that there is an issue, but I have yet to see anyone who has a complete end-to-end solution. Some vendors have great radios, some vendors monitor excellently, and some vendors offer a cost-efficient way of backhauling. As has happened many times before, a very clever network engineer somewhere around the world will come up with a way to solve these problems elegantly, without going way over budget and while ensuring a consistently good revenue stream.

Are you that engineer ;-) ?

John McCann
http://mccanntelecom.com/



Mobile World Congress

Monday, February 22nd, 2010


I’m currently in Barcelona at the Mobile World Congress.  I didn’t expect this show to be very well attended this year and on Monday I was beginning to think it might be a waste of time to be here. Tuesday and Wednesday proved me wrong and I’ve been pleasantly surprised.  The amount of foot traffic, even in the smaller halls, is significant.

The buzz this year has been around mobile applications. The number of people in this space is mind-boggling. How applications will affect our lifestyle was apparent on every second stand. One theme I saw repeated over and over again was “How you can be a journalist armed only with an iPhone”.

This explains the dearth of updates recently, as preparing for and working a show such as this takes a lot of time.

It’s now time to break down the Accedian stand and try to find a beer so I’m cutting this post short.  I hope to have a larger post covering other aspects of this year’s show done by this weekend.

John McCann    http://mccanntelecom.com/
