Kitz ADSL Broadband Information
Dynamic Line Management - BT's DLM process

 

 

Openreach FTTC and Wholesale ADSL use the same BT DLM system.  Both share many of the same features, but there are 3 sub-systems for 20CN, 21CN & FTTC as each has slightly different profiles and parameters. The NGA FTTC system is operated by BT Openreach.
This tutorial looks at the DLM Function, focusing on which parameters are monitored, how it classifies a line and how it calculates whether any changes need to be made to the DLM profile.

Dynamic Line Management

~ DLM Introduction

BT Wholesale's Dynamic Line Management System is an extremely large topic.  In an attempt to make it easier to digest, the subject has been split into several pages which you may find useful to aid understanding how the DLM works as a whole:

  1. The DLM System: - Looks at the hardware & software systems used to monitor broadband lines. Knowing what each device in the DLM system is responsible for and what it does helps to visualise DLM Management and the processes involved.  

  2. The DLM Process: - Focuses on how the DLM system monitors a line and how it decides if any changes need to be made to your DLM profile.  We describe in detail each process carried out by the DLM System, what algorithms are used and how the decision is made whether any changes need to be made to your DLM Profile.

  3. The DLM Profiles: - Although the DLM system & process is similar regardless of product, there are some slight differences between what parameters the system can configure depending on whether you have ADSL1, ADSL2+ or VDSL2.   The DLM profiles page breaks down the differences between the products and what configuration changes can be made for each type of xDSL. (Page not yet published.)

 

~ Monitoring the line

Over the course of a day information about the line will be recorded by the DSLAM's Data Collector.  The daily data file monitoring period is split up into 96 x 15 minute bins and each of those bins contains the following information about the line:

  • Data indicative of user activity.
    • Based on traffic counts of both upstream & downstream traffic. 
    • A non-zero traffic count is taken as indication that the line has had user activity. 
    • Zero traffic count indicates the line has not been in use.
  • Data indicative of instability
    • One or more resynchronisations
    • One or more errors caused by code violations - ES/SES
    • Failed initialisations
  • Connection Rate.
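
As a rough illustration, one 15 minute bin could be modelled as below. The class and field names are hypothetical and are not taken from any BT documentation; only the kinds of data recorded come from the list above.

```python
from dataclasses import dataclass

BINS_PER_DAY = 96        # 24 hours split into 15 minute bins
BIN_SECONDS = 15 * 60    # length of one bin in seconds


@dataclass
class Bin:
    """One 15 minute record for a line (illustrative field names)."""
    traffic_count: int = 0         # upstream + downstream traffic seen
    uptime_seconds: int = 0        # seconds in sync during this bin
    retrains: int = 0              # resynchronisations
    errored_seconds: int = 0       # ES/SES caused by code violations
    failed_inits: int = 0          # failed initialisations
    connection_rate_kbps: int = 0  # connection rate

    @property
    def active(self) -> bool:
        # A non-zero traffic count is taken as a sign of user activity.
        return self.traffic_count > 0


# A day's data file is then simply 96 of these records.
day = [Bin() for _ in range(BINS_PER_DAY)]
day[0] = Bin(traffic_count=1200, uptime_seconds=900, errored_seconds=3)
```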

The Openreach monitoring period currently runs from 8pm to 8pm.  In 2019 a few lines may have been monitored using a temporal system where the monitoring period only runs during peak time and any errors outside of that time are ignored.  As at 2021 the bulk of lines are still using the 8pm-8pm system.

~ Stability Levels

There are three levels of stability profiles which may be applied to a line.  The level of stability may have been chosen by the end-user, but usually the ISP will select a default profile on behalf of the user.  This stability preference affects how the DLM will react to any period of instability on the line.

Stability Level | WBC Profile (20/21CN) | NGA Profile (FTTC) | Description
1. Aggressive | Standard | Speed | Prioritise speed over stability for online gamers
2. Normal | Stable | Standard | Best overall balance between speed and stability
3. Stable | Super Stable | Stable | Prioritise stability over speed for IPTV
- | Custom (SIN 472) | - | Allows a CP to specify the thresholds which DLM will manage the line towards.

There is some confusion over the naming & mapping of profiles between BTw and BToR.  The latter does not allow interleaving to be turned off for FTTC, despite this being mentioned in various documents.
  • ISPs known to use NGA Speed Profile: AAISP, BT, Plusnet, Zen.
  • ISPs known to use NGA Standard Profile: EE, Sky, TalkTalk, Vodafone.

 

~ Trigger Events

The DLM will monitor the line for any changes in stability. Earlier DLM systems relied purely upon sync events but circa 2010 the BT DLM system was amended to introduce a new method of detecting retrains and also included error detection as a trigger event.

The events used by the DLM system are:

  • Total 24hr ES & SES
  • Total 24hr Full Initialisations
  • Total 24hr Failed Initialisations
  • Total 24hr Uptime
  • Total 24hr Unforced retrain count.

The events used by the RAP system are:

  • Line rate in previous 24hr period
  • Maximum line rate in previous 24hr period
  • Minimum line rate in previous 24hr period

The upstream and downstream are monitored independently.

 

~ Detection of sync events

Whilst the DSLAM is capable of detecting loss of synchronisation and most modern routers are capable of sending a dying gasp message to indicate when loss of sync was through a power failure, BT's DLM system does NOT take any notice of dying gasp messages when it comes to counting retrain events†.

A retrain event is detected by "a RADIUS transaction having occurred" - ie a new authentication event has been recorded on to the BTw network, which is part of the handshake process of synchronisation.

The DLM only counts 'forced' retrains and will disregard any resyncs detected as being an Unforced Retrain or one caused by a Wide Area Event.

An unforced retrain is one in which the user switches off or unplugs their modem for "a period of time greater than the minimum period of time" and that a minimum period of time prior to or after a resynchronisation has elapsed without the line automatically attempting, but failing to establish a connection.

Because we know that the DLM collects data bins every 15 mins and that it monitors traffic count to see if the line is in use, it is therefore recommended if possible to try to leave the router switched off for 30 mins to ensure that the DLM sees at least one complete period of inactivity prior to the resync. 

Algorithm: If a resync is detected in bin 'x' and bin ('x'-1) has > 0 seconds uptime then the DLM will class it as a forced retrain and count it.   If (retrain count in bin x > 0) && (uptime in bin (x-1) == 0) then it is assumed that the retrain was caused by a user event (an unforced retrain) and it is disregarded.
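
The check can be sketched in a few lines of Python. This is only an illustration, reading the rule so that a retrain is discounted when the preceding bin shows zero uptime. The function name and the two input lists are hypothetical, and skipping the first bin of the day (which has no preceding bin in the same data file) is my own assumption.

```python
def count_forced_retrains(retrains, uptime):
    """Count retrains that DLM would not write off as a user power-down.

    retrains[x] is the retrain count in bin x.
    uptime[x] is the number of seconds in sync during bin x.
    """
    forced = 0
    # Bin 0 is skipped: its preceding bin belongs to the previous day's
    # file, which this sketch does not have (assumption).
    for x in range(1, len(retrains)):
        if retrains[x] > 0 and uptime[x - 1] > 0:
            # The line was up just before the resync, so it counts.
            forced += retrains[x]
        # retrains[x] > 0 with uptime[x - 1] == 0 is treated as a
        # user event and ignored.
    return forced


# Bin 2 follows a bin with no uptime, so its retrain is ignored.
# Bin 4 follows a bin with uptime, so both of its retrains count.
example = count_forced_retrains(
    retrains=[0, 0, 1, 0, 2],
    uptime=[900, 0, 300, 900, 700],
)
```

This is why leaving the router off for a full 30 mins helps: it guarantees at least one complete bin with zero uptime before the resync.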

Note on Dying Gasp - Whilst DLM may not make use of the dying gasp message, nor is it mandatory for MCT, modem manufacturers are encouraged to implement its use for Openreach's Test and Diagnostic systems.  This allows ISPs to check EUs have performed a power cycle of the modem prior to a potential engineer visit.  See SIN 498 Section 3.2.5 R.OAM.4.

 

 ~ Detection of Errors

The DSLAM's Data Collector records the number of coding violations (errors) seen on your line.  These figures are also displayed by some modem routers.
The type of coding violation that the DLM is interested in is Errored Seconds.  The DLM then normalises any errors to the total uptime in order to even out any burst periods: 

  • Mean Time Between Errors (MTBE) = Uptime / Errored Second Count.

The MTBE measures Errored Seconds and SES only (Not HEC, CRC or FEC).

Note.  Whilst the DSLAM system may record CRCs and FECs for other OSS purposes, there is only one code violation parameter recorded by the Element Manager used for the MTBE calculation. There are instances whereby if a line is performing particularly poorly, then RAMBO will undertake additional line monitoring direct with the DSLAM.  In such cases it monitors additional parameters such as SNR Margin which are not normally monitored at all for most lines.  For more information see: DLM System - Additional Line Monitoring.

 

~ Data Analysis [Step 1]

Each of the 96 bins is checked to see if there has been user activity and is marked active or dormant. Any instability during dormant periods is ignored as the end-user is unlikely to have been affected by it.

Uptime is calculated from the active bins and any data indicative of instability during these periods is normalised.

Errors and resyncs are normalised to the uptime.  This is calculated by dividing the total time in seconds which the respective line has been in synchronisation and in active use over the past 24 hour period of the monitoring by the number of re-trains or errors recorded in that period.  The two algorithms used are:

  • MTBE (Mean Time Between Errors) = Connection uptime / Code Violations (Errors)
  • MTBR (Mean Time Between Retrains) = Connection uptime / No of retrains
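
A minimal sketch of this normalisation step is shown below. The function names and the tuple layout are hypothetical. Returning infinity when there were no events at all is my own assumption to avoid dividing by zero; how the real system represents that case is not documented.

```python
def mean_time_between(uptime_seconds, event_count):
    """Uptime divided by event count, e.g. MTBE or MTBR in seconds."""
    if event_count == 0:
        # Assumption: no events at all is treated as an unbounded interval.
        return float("inf")
    return uptime_seconds / event_count


def normalise(bins):
    """bins is a list of (active, uptime_seconds, errors, retrains) tuples.

    Only bins marked active contribute, so instability while the line
    was dormant is ignored.
    """
    active = [b for b in bins if b[0]]
    uptime = sum(b[1] for b in active)
    errors = sum(b[2] for b in active)
    retrains = sum(b[3] for b in active)
    mtbe = mean_time_between(uptime, errors)
    mtbr = mean_time_between(uptime, retrains)
    return mtbe, mtbr


# 40 active bins with 3 errors each, 56 dormant bins that are ignored.
bins = [(True, 900, 3, 0)] * 40 + [(False, 900, 50, 1)] * 56
mtbe, mtbr = normalise(bins)
```

In this example the active uptime is 36000 seconds with 120 errors, giving an MTBE of 300 seconds. The errors and the retrains in the dormant bins make no difference to the result.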

This step is done by the element manager with data obtained from the Data Collectors and the information passed to the Management Device RAMBo each day.

As well as MTBE & MTBR line data, the element manager also produces an event data file which is used to monitor for Wide Area Events and forced retrains.  This event data is recorded as an array of each 15 min period in binary format [Uptime, Retrains, Errors]. For example a line which has uptime and errors but no retrains will record [1,0,1].
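
The binary encoding is simple enough to show in code. This is an illustrative sketch only; the function name is made up and the real file format is not public.

```python
def event_flags(uptime_seconds, retrains, errors):
    """Reduce one 15 min bin to [Uptime, Retrains, Errors] as 0/1 flags."""
    return [int(uptime_seconds > 0), int(retrains > 0), int(errors > 0)]


# A line which has uptime and errors but no retrains in this bin.
flags = event_flags(uptime_seconds=900, retrains=0, errors=12)
```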

 

~ Preparing to Categorise the Line [Step 2]

1. Check for Wide Area Events

Each day, the DLM Management Device receives sets of data from the DSLAM's element manager. First it will analyse the event data from all lines to check for events such as thunderstorms which may have caused multiple lines to resync and/or generate lots of errors.

If a pre-determined percentage of lines experience retrains and errors in the same time frame then any events occurring in that time frame will be classed as a Wide Area Event.

Documentation would suggest that the percentage values for wide area events are: >20% of users with uptime experienced a resync OR  >50% of users with uptime experiencing errors && >10% of users with uptime experienced a resync.

So attempting to put it in simple terms, if data in the binary file in any of the 15 min bins at the same time frame meets any one of the following two criteria:

  1. > 20% of bins are [1,1,1] OR [1,1,0]
  2. > 50% of bins are [1,0,1] AND >10% of bins are [1,1,1]||[1,1,0]

then a wide area event is declared for that period. Data from any bin within the corresponding time frame is discarded and not used for the DLM calculation.
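
The two criteria can be sketched as follows for a single 15 min time frame. The function name is hypothetical and the percentages are the suggested values quoted above, not confirmed figures. Percentages are taken over lines that had uptime in that time frame.

```python
def is_wide_area_event(flags_for_slot):
    """flags_for_slot holds one [uptime, retrain, error] triple per line,
    all taken from the same 15 min time frame."""
    up = [f for f in flags_for_slot if f[0] == 1]
    if not up:
        return False  # nothing was in sync, so nothing to compare

    # Share of lines that resynced: [1,1,1] or [1,1,0].
    resynced = sum(1 for f in up if f[1] == 1) / len(up)
    # Share of lines with errors but no resync: [1,0,1].
    errors_only = sum(1 for f in up if f[1] == 0 and f[2] == 1) / len(up)

    return resynced > 0.20 or (errors_only > 0.50 and resynced > 0.10)


# 25% of lines resynced in the same time frame.
storm = [[1, 1, 0]] * 25 + [[1, 0, 0]] * 75
# 55% errors only plus 12% resyncs.
noisy = [[1, 1, 1]] * 12 + [[1, 0, 1]] * 55 + [[1, 0, 0]] * 33
# Only 5% resynced and no errors elsewhere.
quiet = [[1, 1, 0]] * 5 + [[1, 0, 0]] * 95
```

Here `storm` and `noisy` would each be declared a wide area event, whereas `quiet` would not.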

2. Check for Unforced Retrains

An unforced retrain is when the End User has turned off or powered down the modem.  BT does not use the dying gasp, instead preferring to assume that an unforced retrain has occurred when the modem has remained powered down for 'x' period of time.

To check if the modem has remained powered down, it can use information from the event data file. If it detects that a line has retrained in any particular bin, then the preceding bin is checked to see if 0 was recorded for uptime. 

If the preceding bin had 0 uptime then it is assumed that the retrain was an unforced event and will be discounted by the DLM calculation.

3. Get Stability Level.

The Service Provider is identified and the Level of Stability selected for the line is obtained. 

 

 ~ Categorising the Line - ILQ Indicative Line Quality. [Step 3]

Using the MTBE & MTBR data, the Management Device will categorise the line using the relevant stability level metrics.  Either one of the MTBE or MTBR data can trigger the DLM to apply a (further) step to increase line stability.

Below are tables showing the MTBR and MTBE thresholds for each Stability Option*

WBC ADSL/ADSL2+ Line Categorisation Thresholds
Stability Option | MTBR red threshold | MTBR green threshold | MTBE red threshold | MTBE green threshold
Standard | 8,640 | 16,800 | 5 | 250
Stable | 16,800 | 33,600 | 300 | 6,000
Super Stable | 33,600 | 67,200 | 3,600 | 60,000
wef Apr 2014

NGA FTTC Line Categorisation Thresholds
Stability Option | Retrain threshold | MTBR green threshold | MTBE red threshold | MTBE green threshold
Speed | 20 | 8,400 | 30 | 300
Standard | 10 | 16,800 | 180 | 600
Stable | 5 | 33,600 | 360 | 3,600
wef Jun 2012
MTBR update 5/15
MTBE update 6/21

Using the above thresholds there are four possible categories in which the line may be classified: Very Poor, Poor, OK and Good.

Example

Standard DLM Algorithm for line categorisation
Stability | Metric | Good | Approx. interval | Per day | OK | Approx. interval | Per day | Poor | Very Poor
GEA Speed | Retrains | mtbr≥8400 | - | 10 | >4200 && <8400 | - | 20 | mtbr<4200 | >10 per hour
GEA Speed | Errors | mtbe≥300 | - | 288 | >30 && <300 | - | 2880 | mtbe<30 | -
GEA Standard | Retrains | mtbr≥16800 | - | 5 | >8400 && <16800 | - | 10 | mtbr<8400 | >10 per hour
GEA Standard | Errors | mtbe≥600 | - | 144 | >180 && <600 | - | 480 | mtbe<180 | -
GEA Stable | Retrains | mtbr≥33600 | - | 2 | >16800 && <33600 | - | 5 | mtbr<16800 | -
GEA Stable | Errors | mtbe≥3600 | - | 24 | >360 && <3600 | - | 240 | mtbe<360 | -
BTw Aggressive (Standard) | Retrains | mtbr≥16800 | 4.66 hr | 5 | >8640 && <16800 | 2.4 hr | 10 | mtbr<8640 | >10 per hour
BTw Aggressive (Standard) | Errors | mtbe≥250 | 4.16 m | 345 | >5 && <250 | 5 s | 17280 | mtbe<5 | -
BTw Normal (Stable) | Retrains | mtbr≥33600 | 9.33 hr | 2.5 | >16800 && <33600 | 4.66 hr | 5 | mtbr<16800 | >10 per hour
BTw Normal (Stable) | Errors | mtbe≥6000 | 1.66 hr | 14 | >300 && <6000 | 5 m | 288 | mtbe<300 | -
BTw Stable (Super Stable) | Retrains | mtbr≥67200 | 18.6 hr | 1 | >33600 && <67200 | 9.33 hr | 2.5 | mtbr<33600 | >10 per hour
BTw Stable (Super Stable) | Errors | mtbe≥60000 | 16.6 hr | 1.4 | >3600 && <60000 | 1 hr | 24 | mtbe<3600 | -

Two examples:

1). If a line is operating at Standard Stability and the average time between retrains over the day is less than 8640 seconds (2.4 hrs), which equates to more than 10 per day, OR if the average time between errors is less than 5 seconds of active uptime (more than 17280 per day), then the line would be classified as poor.

2). If a line is operating at Standard Stability and the average time between retrains over the day is at least 16800 seconds (4.66 hrs), which equates to no more than 5 per day, AND if the average time between errors is at least 250 seconds of active uptime (no more than 345 per day), then the line would be classified as good.
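
To make the two examples concrete, here is a minimal sketch of the categorisation using the WBC ADSL/ADSL2+ thresholds from the table above. The function names are hypothetical. Treating a value that lands exactly on the red threshold as OK is my own assumption, since the published ranges do not say which side it falls. The Very Poor category (more than 10 retrains per hour) is left out as it needs per-hour counts rather than MTBR.

```python
WBC_THRESHOLDS = {
    # option: (MTBR red, MTBR green, MTBE red, MTBE green), all in seconds
    "Standard": (8640, 16800, 5, 250),
    "Stable": (16800, 33600, 300, 6000),
    "Super Stable": (33600, 67200, 3600, 60000),
}


def grade(value, red, green):
    """Grade a single metric against its red and green thresholds."""
    if value >= green:
        return "good"
    if value < red:
        return "poor"
    return "ok"


def categorise(option, mtbr, mtbe):
    """Either metric can make the line poor; both must be good for good."""
    mtbr_red, mtbr_green, mtbe_red, mtbe_green = WBC_THRESHOLDS[option]
    grades = {
        grade(mtbr, mtbr_red, mtbr_green),
        grade(mtbe, mtbe_red, mtbe_green),
    }
    if "poor" in grades:
        return "poor"
    if grades == {"good"}:
        return "good"
    return "ok"


example_1 = categorise("Standard", mtbr=8000, mtbe=1000)   # retrains too frequent
example_2 = categorise("Standard", mtbr=20000, mtbe=300)   # both metrics green
```

The conversion to "per day" figures used in the table is just 86400 divided by the threshold, so an MTBR of 8640 seconds is 10 retrains per day and an MTBE of 5 seconds is 17280 errored seconds per day.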

When the line category has been obtained, the DLM system will move on to the next step to check if any changes to the DLM profile need to be made.

*These figures are provided in good faith and may not necessarily be the most up to date.

 

~ Making Changes to the DLM Profile.

When a line has been categorised, the Management Device checks to see if any changes to the DLM profile need to be applied.

DLM Action Status
Line Classification | ILQ Status | Action
Good - Performing beyond expectations | Green | Check if any of the DLM parameters can be removed or reduced.
Performing within acceptable params | Amber | No changes will be made to the DLM profile.
Poor MTBR/MTBE | Red | The system will apply a further DLM step to increase stability.
Poor MTBE upstream | Crimson | The system will apply a further DLM step to increase stability.
Rapid Retrains | Scarlet | The system will undertake additional line monitoring so that immediate changes to profiles may be made.
No DLM data | Grey | No action.
Insufficient DLM data | Black | Day's uptime was less than 15 mins. No action.
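
As a rough sketch, the main rows of the table could be expressed like this. The function name and its inputs are hypothetical. Crimson (poor upstream MTBE) and Scarlet (rapid retrains) are not modelled, as they need information beyond the overall line category.

```python
def ilq_status(category, uptime_seconds):
    """Map a line category of 'good', 'ok' or 'poor' to an ILQ colour.

    uptime_seconds is the day's total uptime, or None if no DLM data
    was received for the line.
    """
    if uptime_seconds is None:
        return "grey"    # no DLM data, no action
    if uptime_seconds < 15 * 60:
        return "black"   # insufficient data, no action
    return {"good": "green", "ok": "amber", "poor": "red"}[category]


status = ilq_status("poor", uptime_seconds=80000)
```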

Up until this point, the DLM process for ADSL1, ADSL2+ and FTTC is very similar. Any changes the DLM system makes now depend on the product type as each of these has different parameters which may be adjusted.

The individual product parameters will be discussed in more depth on a separate page but a summary is shown below:

Parameters which may be adjusted by the DLM
  SNR Margin Interleaving INP Capping/Banding
         
ADSL1 Yes - 6-15 dB ON/OFF NO Extreme circumstances
  Example profile:     on   9   6   off
ADSL2+ Yes - 3-15 dB OFF/Low/Med delay INP - 0/1/2 Yes.  UC = Uncapped
  WBC 160K - 24M Medium delay (INP 1) 15dB Downstream, UC Medium delay (INP 2) 6dB Upstream (ADSL2+)
FTTC No - Fixed 6dB OFF/Low/High G.INP - Yes
  0.128M-10M Downstream, Retransmission Low - 0.128M-1.3M Upstream, Error Protection Off

 

 ~ Removal of DLM intervention - Reversal of Interleaving & Error Protection steps.

Unfortunately very little is known about this part of the process.  What we do know is that the line must be achieving ILQ green status for 'x' period of time.  The period of time varies, which is deliberate to ensure that a line doesn't flap between profiles. There has been mention of a 'doubler' method and although this would also make sense from what we have observed, there is no hard evidence that this is fact.

DLM is usually quite forgiving for a first time intervention and the line will go down one step after a full day of stability.  This fact has been borne out by many users on our forums many times over.  I've even experienced it myself first hand several times.  

  • Case One:  Testing a new router for a manufacturer. Firmware had a bug with the bitswap process and caused the line to have interleaving applied.   Router was swapped out and interleaving removed after full day of MTBE green.
  • Case Two: Day 1 - Line fault caused massive amount of Errored Seconds. Day 2 - DLM applied interleaving, but high Errored Seconds still continued.  Day 3 - DLM applied INP, Errored Seconds continued.  Day 4 - DLM applied more interleaving. Errored Seconds continued to exceed MTBE red.  Day 5 - DLM increased INP but fault found at remote location and was fixed. Day 6 - Line stable, no Errored Seconds.  Day 7 - DLM reduces INP.  Day 8 - DLM reduces amount of interleaving. Day 8 - DLM totally removes INP. Day 9 - DLM totally removes interleaving.  

It would appear to use some sort of doubler method for each individual intervention, so the more times a line sees DLM action, the longer it takes for it to be removed.

  • Example:  Day 1 - Line ILQ exceeds MTBE red. Day 2 - DLM applies interleaving. Day 3 - No errors, MTBE Green. Day 4 - DLM removes interleaving, but line immediately starts to see errors and goes MTBE red. Day 4 - DLM reapplies interleaving.  ILQ status Amber.  Day 5 - ILQ status green.  Day 6 - DLM takes no action and waits a further period. It is at this stage where additional line monitoring is possibly performed to ensure other line parameters such as SNRm are reasonably stable before making the decision to remove interleaving.
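
Purely as a guess at what a 'doubler' could look like, the sketch below doubles the number of consecutive green days required each time DLM has to intervene again. None of this is confirmed: the function name, the starting value of one day and the doubling itself are all assumptions based on the observations above.

```python
def green_days_required(interventions):
    """Guess at the wait before an intervention is reversed.

    interventions is how many times DLM has applied this step to the
    line. A first intervention needs 1 green day, a second needs 2,
    a third needs 4, and so on (assumed, not documented).
    """
    if interventions < 1:
        raise ValueError("interventions must be at least 1")
    return 2 ** (interventions - 1)


waits = [green_days_required(n) for n in (1, 2, 3, 4)]
```

This would fit the example above, where the line was green on Day 5 after a second intervention yet DLM took no action on Day 6.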

 

 ~ DLM reset

With ADSL/ADSL2+ products, it is possible for the ISP to reset the DLM.
For NGA products (Fibre) the ISP cannot perform a reset and this can only be done by a BT Openreach engineer after clearance of a line fault.
Update 2019 - ISP can request DLM reset from Openreach without having to call out an engineer but only if DLM appears stuck and the line has been deemed stable for a suitable period of time.

 

 ~ Note by author

This page has been compiled after months of research and countless hours reading all available information about BT's DLM.
Things ground to a halt just prior to the ASSIA court case and since then it has been nigh on impossible to get any new information about the BT DLM or changes made since that date.  The ASSIA court case appears to center around BT's ILQ system and the process steps and decisions undertaken to change and reverse DLM steps, which is why there is little information about this stage.  
I had hoped that in time, new information would come to light, but 8 months later still nothing. Although this page has been here for a while pending an update, it has been requested several times that I publish what information I do have.  All pieces are not there when it comes to the ILQ, but afaik the data analysis steps still remain exactly the same and hence publication now. If more information does ever become available then I will update.

©kitz 2014
last update Jun 2015
Updated 6/21. New DLM params

 

 
Copyright © Kitz 2003-
All rights reserved
Unauthorised reproduction prohibited