Video DialTone, My Second Look.

G. D. Purdy

Copyright © March 2003

Abstract

This article briefly presents a method first proposed in December 1994 for implementing a video dialtone service, and then takes another look at the proposal from the standpoint of current technological improvements.


Introduction

Beginning in December of 1994, I suggested to various ranking telephone company executives a means whereby on-demand video and interactive services over ordinary telephone lines could be quickly made available.   The mainstay of this approach was to implement these services primarily by adding more of the telco hardware already in use for regular service and modifying existing software for the new services.   As DSL (Digital Subscriber Line) was essentially still in the future, I suggested that a user would dial into a Local Access And Bridging node (LAAB), which would route the user to whatever services were selected.   The LAAB was to be a telco frame, i.e., standard hardware, and the software driving it was to be the standard telco directory assistance software with screens modified for use by the subscriber and masking the actual telephone numbers.   Thus, menu-selection would actually be call-connection in disguise that, in addition to accessing the LAAB's local services, would provide bridging to nodes providing other services such as Picture Phone™, the Internet, library, or computational services including, but not limited to, games.

One advantage to this method, noted at the time, was that virtually all possible telephone numbers in use in the U.S. and Canada would become available for reuse one or more times within the video dialtone system.   Where satellite and cable providers spoke of providing 100, 200, 500 channels, this method would have the capability of providing millions of channels.   How this would work is that a telephone number can mean whatever is needed within any individual telco frame once it is disassociated from a specific telephone line.   (Thus, in 1994, without noticing it, I created the concept of the virtual telephone number; that I had done so was only recently brought to my attention.)
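The reuse of telephone numbers described above can be pictured as a lookup table held separately by each node.   The sketch below is only an illustration of that idea; the class `Laab`, its methods, the numbers, and the services are all hypothetical and are not taken from any telco software.

```python
# Illustrative sketch only: the class, method names, numbers, and
# services below are hypothetical, not part of any telco system.

class Laab:
    """A node that gives telephone numbers its own local meaning."""

    def __init__(self, name):
        self.name = name
        self.services = {}  # virtual number -> service description

    def assign(self, number, service):
        # Inside this node the number is not tied to a phone line,
        # so it can stand for whatever service is needed here.
        self.services[number] = service

    def select(self, number):
        # A menu selection is a call connection in disguise: the
        # subscriber picks a title, the node "dials" the hidden number.
        return self.services.get(number, "not in service")


movies = Laab("movies")
library = Laab("library")

# The same ten digits mean different things in different nodes.
movies.assign("212-555-0100", "feature film, channel A")
library.assign("212-555-0100", "reference desk catalog")

print(movies.select("212-555-0100"))
print(library.select("212-555-0100"))
```

Because each node keeps its own table, one number can be reused once per node, which is where the "millions of channels" figure comes from.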

This approach to video dialtone was inspired in part by the article in the September 1994 issue of the SMPTE Journal, "Delivery of TV Over Existing Phone Lines," by Peter F. Prunty.[4]   I relied heavily on this article then, and still do in taking this additional "look."   Prunty described therein "a system for delivering compressed digital TV over existing phone lines concurrent with conventional phone service . . ." using ". . . Motion Picture Experts Group (MPEG) compression and . . . carrierless amplitude phase modulation (CAP)."   Prunty did not address video product selection in his article, so I did it.

I made all of my suggestions, all relating to selection and delivery of video dialtone services, within the space of about six months, and then went on to other things.   Recent events, including taking a graduate-level computer networks course (Fall 2002), a comment made to me to the effect that the proposal was too early in terms of what was doable in 1994, and being a recipient of Verizon's current push to get their customers to sign up for DSL, have brought my attention back to video dialtone and have inspired the "second look."

This article is as much for a general audience as it is for the "techie."   Those who wish are invited to skip over the "over obvious."


Delivering The TV Signal, Then And Now

In 1994, Prunty selected CAP as the means of delivering the TV signal over existing, twisted-pair wire, phone lines.   In support, he charted theoretical transmission capacity and actual CAP capacity, citing and using in part the works of Claude Shannon[5] and Henry Nyquist[3].   I suspect it was this charting, plus the stated need to get sufficient compression to not exceed available bandwidth by degrading video where necessary, that evoked the comment that the proposal was too early.

I am not quite ready to concede that the proposal was too early as I believe that an alternate application of CAP may have brought sufficiently improved transmission efficiency.   While Prunty does not say it in so many words, he implies a streaming service, a continuous stream of signal: "Underflow must also be avoided, so that all the data received . . . is valid data."[4]   I read this to mean that some parts of the signal were being "padded" to assure that the line's bandwidth was being kept "full."   What if, instead of padding, "controlled streaming" were implemented?   What if the receiver maintained a "circular" data buffer that would receive the unpadded TV signal stream as it arrived, releasing it at the appropriate time in accord with the "timing marks" already built into the stream and, as the buffer came close to being full, sent commands back to the provider that would "pause" the stream for a short period of time?   This method, a variation on that used in some computer networks today, would have allowed the signal to get ahead of the display rate when compression was naturally high, and the circular buffer could then be "drawn down" when compression was low enough that the signal would otherwise not arrive in time to display.
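The controlled-streaming idea can be tried out on paper with a small simulation.   What follows is a minimal sketch under assumed units: one "tick" is one display interval, picture sizes are in arbitrary units, and only the fill level of the buffer is tracked, not the ring itself.   The function name, its parameters, and the sample numbers are all hypothetical.

```python
# A minimal sketch of "controlled streaming," under assumed units:
# one "tick" is one display interval, and sizes are arbitrary units.
# All names and numbers here are hypothetical.

def simulate(frame_sizes, line_rate, capacity, high_water):
    """Return (late_ticks, pauses) for one playback run."""
    # The next frame must always fit below the pause threshold,
    # otherwise a paused stream could never resume.
    assert max(frame_sizes) <= high_water <= capacity
    assert line_rate > 0

    total = sum(frame_sizes)
    received = 0    # units that have arrived in the buffer so far
    consumed = 0    # units already released to the display
    next_frame = 0  # index of the next frame due on screen
    paused = False  # True while the provider has been told to wait
    late_ticks = 0  # display intervals with nothing new to show
    pauses = 0      # number of "pause" commands sent upstream

    while next_frame < len(frame_sizes):
        # Provider side: send at line speed unless paused, and never
        # more than the buffer has room for.
        if not paused:
            room = capacity - (received - consumed)
            received += min(line_rate, total - received, room)

        # Receiver side: release the next frame if all of it is here.
        need = consumed + frame_sizes[next_frame]
        if received >= need:
            consumed = need
            next_frame += 1
        else:
            late_ticks += 1

        # Flow control: pause when the buffer is close to full.
        nearly_full = (received - consumed) >= high_water
        if nearly_full and not paused:
            pauses += 1
        paused = nearly_full

    return late_ticks, pauses


sizes = [2] * 10 + [6] * 4   # a quiet stretch, then a busy one
roomy = simulate(sizes, line_rate=4, capacity=16, high_water=12)
tight = simulate(sizes, line_rate=4, capacity=6, high_water=6)
print(roomy, tight)
```

With these made-up numbers the roomier buffer gets ahead during the quiet stretch, is paused a few times, and then draws down through the busy stretch without a late interval; the small buffer cannot get ahead and shows late intervals.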

I am also not quite ready to insist that my alternate application of CAP would have improved delivery enough to make the result acceptable.   This could only have been determined by using a test-bed for experiments and, with the telcos choosing to provide DSL, that part of the argument has become moot, at least in regard to CAP.

At first glance, DSL does not appear to offer any advantage over CAP, but it is what the telcos are offering so, unless something changes, we may well be stuck with it.   To quote Andrew S. Tanenbaum's Computer Networks[6], "Typically, providers offer 512 kbps downstream and 64 kbps upstream (standard service) and 1 Mbps downstream and 256 kbps upstream (premium service)."   In fact, a visit to Verizon.com[1] shows a wide range of service speeds**.   A test-bed could be used to determine which speeds would make video dialtone practical for which classes of video, and how well buffering with (possibly) controlled streaming would preserve video quality were it implemented.

(**Note added May 2004  -  Verizon has modified its website since this article was written.   The speeds available for residential service are no longer cited, and the speeds available for business service are harder to find.   On the plus side, Verizon has been very busy installing fiber-optic "to the curb," so the time is coming when all of the speeds originally cited will no longer apply.)

Despite the implications of the above paragraph, I am not ruling out the implementation of some alternate to DSL. While this discussion continues as if the delivery system uses DSL, kindly keep in mind that the points discussed quite likely apply to any transmission method.

Generally speaking, the bandwidth possible on a DSL line is in an approximately inverse logarithmic proportion to the twisted-wire length portion of the circuit; as an alternate to "twisted-wire length," one might say "distance between the local telco office and a potential subscriber."   Thus, when a telco selects "a speed to offer, it is simultaneously picking a radius from its end offices beyond which the service can not be offered."[6]   At the same time, it is artificially limiting the speed possible for a subscriber who is nearer to the end office as, over short distances, speeds as high as 8 Mbps are possible.   As of this writing, Verizon offers up to 1.5 Mbps downstream for residential and up to 7.1 Mbps downstream for businesses[1].

Rob Koenen's article, "Object-based MPEG Offers Flexibility," in EE Times[2] at http://www.eetimes.com/story/OEG20011112S0042 is generally interesting but not all that relevant to our discussion, except for one chart, about a third of the way through the article, that indicates the bandwidth needed for a TV signal using MPEG1 compression.   MPEG1 is what Prunty specified, and it is treated in the next section.   Note that, according to the chart in Koenen's article, 8 Mbps is about four times the maximum needed for an MPEG1-compressed TV signal.

Fortunately, there may be ways to "cheat," at least for the benefit of some DSL subscribers.   In a sense, the telephone industry "cheated" in implementing the 56 kbps modem for data transmission using the ordinary voice-grade circuit.   What the modem actually does is to continuously sense the condition of the circuit and, in conjunction with the modem on the other end of the circuit, vary the baud rate based on what the circuit will bear; it also "steals" some upstream bandwidth for downstream use.   As some users have noted, the 56 kbps modem does not always provide the full 56 kbps and, not so well known, the modem may occasionally exceed 56 kbps.   Enhancing the DSL modem through the application of similar "tricks" (let's call the result "enhanced DSL") could allow higher effective speeds for subscribers whose lines can handle it, and allow service for those would-be subscribers who are currently being excluded.

Enhanced DSL would also provide for some acceptable on-demand video services for those subscribers with slower DSL.   How this can be done is discussed further in the next three sections.


Compressing The TV Signal

In 1994, Prunty proposed using MPEG1 to compress the TV signal, and a glance at the EE Times chart[2] shows that MPEG2 would not be a suitable replacement for MPEG1 until DSL subscribers get fiber all the way into their premises.   This discussion will stick with MPEG1.

MPEG1 provides for four different picture types.   One type, the "D-picture," is highly compressed but is used solely to facilitate fast forward and/or reverse searching, and is mentioned here only to get it out of the way.   It is not used for the compression that concerns us.

Of the other three picture types, the "I-picture," the "B-picture," and the "P-picture," the I-pictures are essentially JPEG pictures, each representing an entire scene at a certain point in time, and do not provide anything like the compression needed.   The B-pictures and P-pictures are interleaved between the I-pictures, and it is they that provide most of the needed compression of the video stream.

B-pictures and P-pictures are not complete pictures; rather, at some very close points in time, they modify 16 pel x 16 pel blocks of the existing I-picture that they have followed so as to represent the next variation-with-time of the scene to be displayed.   An I-picture would be displayed, modified by some number of B-pictures and/or P-pictures and redisplayed, and then be modified and redisplayed some number of times again using other separate groups of B-pictures and/or P-pictures for each redisplay.

It may be apparent at this point that the obtainable compression is directly proportional to the lack of activity shown in a scene.   Depicting more and more changes to the scene with each redisplay requires a progressively larger group of B-pictures and/or P-pictures for each redisplay.   In other words, as the scene changes more on each redisplay, less compression is possible.   This means that a typical soap opera of the type usually shown on afternoon TV, one where the actors tend to pose and recite "pronouncements," is much more compressible than something with far more frequent scene-changes such as a basketball game where ten players plus referees are running about in front of a background consisting in great part of an audience that also contributes motion.   As the MPEG1 video coding range goes as low as 0.3 Mbps[2], "substandard" DSL, while not at all suitable for a basketball game, could be expected to depict a soap episode very well indeed, and we surely know better than to ignore the revenue-stream that the soap fan-base would generate.


When The TV Signal Overwhelms Bandwidth

Subscribers will certainly try to view video that is not compressed enough for their bandwidth, and probably will try to view that which is far beyond anything recoverable through buffering and controlled streaming at that.   All is not lost, however, as there are still steps, depending upon whether the stream is "live" or pre-recorded, that can be taken to improve the result that would be presented.

The probably-best solution for pre-recorded TV is simple, and is to be applied solely in the subscriber's equipment.   If the next version of the scene is not ready to display on time, simply continue to display the current version until the new version is ready.   The result will be a slowing of the activity depicted, but that might not be noticed and may well be bearable if it is noticed.
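The rule just described, keep showing the current version until the next one is ready, fits in a few lines.   This is a sketch only, assuming a hypothetical list of arrival times measured in display ticks; the function name and the sample values are invented for illustration.

```python
# Sketch of "hold the current picture until the next one is ready."
# arrival_ticks[i] is the (hypothetical) display tick by which
# version i of the scene has fully arrived.

def playback(arrival_ticks):
    """Return, for each display tick, the index of the version shown."""
    shown = []
    current = None      # nothing on screen yet
    next_version = 0
    tick = 0
    while next_version < len(arrival_ticks):
        if arrival_ticks[next_version] <= tick:
            # The next version is ready: show it.
            current = next_version
            next_version += 1
        # Otherwise the previous version simply stays on screen.
        shown.append(current)
        tick += 1
    return shown


# Versions 0 and 1 arrive on time; version 2 is three ticks late.
schedule = playback([0, 1, 4, 5])
print(schedule)
```

Nothing is skipped; the four versions are all shown in order, and the late arrival simply stretches the playback from four display ticks to six.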

An alternate method, on which some work has been done, is to "predict" what the changes to the picture will be, display that, and discard the replaced data when it finally does arrive.   The problems with this alternate are two-fold; the data following the discarded data is most likely also going to be late, and the prediction could be wrong.

Live TV requires a different approach as the action being depicted takes place before cameras and can not in itself be slowed.   The answers in this case (there are two possibilities that are not mutually exclusive) cannot be applied at the subscribers' end but rather at the telco end office or higher, and implementation will be more complicated for either approach.

One way is to have the stream keep up with the live action.   To do this, some of the stream will have to be discarded rather than sent through the bottle-neck, the subscriber's twisted-pair wire.   Something like this is sometimes done in computer networks to reduce congestion anyway, so it is not as if we are breaking any really new ground here.   If the discard is not severe, it may not be noticed; more severe discard will result in "jitter" or worse.

A circular buffer would be needed at the telco end of the twisted-pair for this.   As the buffer became close to being full, B-pictures and P-pictures immediately preceding the next I-picture would be discarded rather than being sent to the subscriber; if needed, all B-pictures and P-pictures between a pair of I-pictures would be discarded; if needed, the first I-picture of an I-picture pair with all intermediate B-pictures and P-pictures discarded would itself be discarded.   Following this method, whole blocks of signal could be discarded as long as one rule is followed, that no I-picture is to be discarded unless all of its following B-pictures and P-pictures are also discarded (which I understand is done in ATM).   Discarding signal would save the transmission time through the bottle-neck that would otherwise be used and the remaining signal would continue to arrive approximately on time.
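The discard order described above amounts to trimming each group of pictures from its tail.   The sketch below follows this article's simplified model, in which an I-picture is followed by the B-pictures and P-pictures that modify it; the function name, the size units, and the "budget" (what the line can carry before the next I-picture is due) are hypothetical.

```python
# Sketch of tail-first discard within one group of pictures.
# Each picture is a (kind, size) pair; sizes and budgets are in
# arbitrary, hypothetical units.

def thin_group(pictures, budget):
    """Keep pictures from the front of the group while they fit."""
    kept = []
    used = 0
    for kind, size in pictures:
        if used + size > budget:
            # This picture and everything after it is discarded, so
            # the pictures just before the next I-picture go first.
            break
        kept.append((kind, size))
        used += size
    return kept


group = [("I", 10), ("B", 2), ("B", 2), ("P", 4),
         ("B", 2), ("B", 2), ("P", 4)]

for budget in (26, 20, 10, 8):
    kinds = [kind for kind, _ in thin_group(group, budget)]
    print(budget, kinds)
```

Because the I-picture leads its group, it can be lost only when the budget is too small for even the I-picture alone, in which case everything after it is discarded as well, which satisfies the one rule stated above.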

The other method, which has advantages beyond just feed-control, is to have live TV go directly to a file and to then be able to be immediately transmitted from the file.   For the subscriber with a DSL connection that is able to keep up with the data stream, the result is a real-time feed because the data is made available to transmit as soon as it is placed at the end of the file.   The subscriber with slower DSL, while starting equal with his favored kin, would fall behind, slowly or quickly, depending on line deficiency, but even that can be alleviated by judicious real-time editing of feeds.
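How far a slower line falls behind a feed that is being written to a file can be estimated with simple bookkeeping.   The sketch below is an assumption-laden illustration: the feed and line rates are in arbitrary units per tick, and the function and subscriber names are hypothetical.

```python
# Sketch: a live feed is appended to a file each tick, and each
# subscriber's line reads from its own position in that file.
# Rates are in arbitrary, hypothetical units per tick.

def simulate_feed(feed_per_tick, line_rates, ticks):
    """Return how many units each subscriber is behind the live feed."""
    recorded = 0                            # current end of the file
    sent = {name: 0 for name in line_rates}  # each reader's position
    for _ in range(ticks):
        recorded += feed_per_tick           # new material is appended
        for name, rate in line_rates.items():
            # A line sends what it can, but never past end of file.
            sent[name] += min(rate, recorded - sent[name])
    return {name: recorded - done for name, done in sent.items()}


lag = simulate_feed(3, {"fast": 4, "slow": 2}, ticks=10)
print(lag)
```

The line that can carry more than the feed stays at the end of the file, which is the real-time case; the slower line falls further behind by a fixed amount every tick.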

It should be noted that the suggested use of the "circular buffer" was based on the personal computer capabilities of 1994.   While I have so far kept the term for consistency, CPUs today are much faster, and the discs are not simply faster but have much greater capacities.   In the typical environment today, the "buffer" can actually be a disc file capable of containing a video stream that is hours long, and such capacity can eliminate much, if not all, of the need for the streaming to be "controlled."

Finally, it should also be noted that the telcos' efforts to extend fiber-optics to the curb and even beyond, thus providing the ability to radically improve bandwidth, will eventually make moot many of the considerations discussed in these last three sections.


Back To The LAAB

My original proposal suggested that a user would dial into a LAAB to access video dialtone services.   With current DSL offerings, video services are providable but, as the DSL circuit is "always on," the term "dialtone" might not have anything to do with it anymore.   Let's see if we still have use for the LAAB.   To determine where, and whether, the LAAB would provide an advantage, we first need to understand how Internet and telephone communication methods differ.

Originally, all telephone calls were physical connections.   The user picked up a telephone, an operator responded, the user recited the person - later the number - wanted, and the call was routed by placing plugs into a switchboard.   As the years passed, the telco frame took the place of the operator's switchboard and the connection became virtual rather than physical as ways to route multiple calls over the same circuit were implemented.   What remained constant was that, whether the connection was physical or virtual, the telco was not allowing a call to complete unless it could route the entire call and guarantee sufficient bandwidth for the voice-grade connection, i.e., unless the "connection" could be set up.   This type of service is called circuit switching.

It is worth noting that circuit switching also provides that all information, data or voice, is routed the same way, and arrives in the order sent and without delay.

Internet communication, by comparison, uses packet switching; no telco-type connection is established between communication end-points.   As data is sent over the Internet, parts of the data may be routed through different nodes and encounter different delays.   It is possible for parts of the data to arrive in an order that is different from the one in which they are sent, and residential users may have noticed the delay aspect, particularly when waiting for the out-of-order piece of data.

TV signal delivery will tolerate neither waiting for signal parts that are in the wrong order nor other delays very well.   Therefore, assuring on-time delivery of the TV signal suggests circuit switching, and so, for this reason alone, the LAAB is still best for at least the streaming services.

The original reason for introducing the concept of the LAAB was to quickly implement a product selection environment that would be easy to scale, and the U.S. telephone system, using the telco frame, is infinitely scalable and extremely reliable.   Millions of people use it every day, a goodly proportion thereof using computerized directory assistance, DAS/C, and a good proportion of those electing to have the call automatically dialed.   The LAAB with associated nodes, driven by DAS/C extended to the subscriber, would take advantage of what already works and could also facilitate billing.   Is there any better way of implementing product selection than by using something that is already working?

Internet communication, via DSL services or other, requires that the subscribers' equipment have the appropriate modems or equivalents, plus supporting software.   Modems modulate and demodulate the digital signal/binary signal/bit stream that is sent over the subscribers' lines, which they do very well, but that is not the entire issue.   The things that modems and their driving software do in the process of sending and receiving are divided into sections (we do not want to get too detailed here) called "layers."   At the bottom, actually "talking to the line," is what is known as the physical layer, with the data link layer directly above it; the hardware of the modem provides the physical layer plus "stray" elements of the data link layer that have been migrated into it.   Above these two layers are other software layers.   The nice thing about a layer is the function that it performs; the not so nice thing is that the layer adds bits to the bit stream for framing, error correction, or whatever to perform the communication function, and this exacts a price in bandwidth because these "extra" bits must also travel over the twisted pair.   Plugging the subscribers' DSL into a LAAB makes possible stripping some of the layers (not the physical or data link) out of the subscribers' equipment and placing them at the LAAB end of the circuit, thus reducing the overhead where it hurts the most, through the subscribers' twisted-pair.   For non-internet services, particularly the streaming services, some of these layers could even be eliminated to good effect, but note that we have just changed the definition of the LAAB away from that of a standard telco frame.

By having a LAAB, the telcos and others are able to provide the subscriber with non-Internet services that are more secure than those using the Internet because such communication is much less subject to interception and disruption.   For example, such communication could not be intercepted by duplicating an IP address, would be more protected from viruses, and would be immune to denial of service attacks.

It is worth saying again that the LAAB with associated nodes has the capability of providing millions of channels rather than the hundreds currently touted by satellite and cable providers, but it has not escaped notice that one or more of these providers could also use a LAAB-with-associated-nodes selection and delivery system, or its equivalent, to provide the same result.

There is still one more factor favoring the LAAB.   The FCC wants telcos to provide access to all would-be providers; the LAAB, and the nodes that it would connect to, would be good for that too.


Hardware For DSL TV

The hardware needed for TV hookup to the DSL depends upon where one starts.   I know people who already own personal computers, who want DSL with TV, and who want to view TV using the computer monitor.   Having large-screen computer monitors, or knowing that they are available, they feel that they would need only the one screen for everything.   For those who want more, consider that, some years ago, I assisted former Dumont Labs pioneering TV engineer Lawrence "Loss" Litchfield as he attached a standard large-screen television set to a personal computer in order to show an audience what was displayed on the computer's screen.   Loss used an off-the-shelf interface then, the setup took less than five minutes, and your neighbor's eight-year-old could have done it.

The short answer here is that the hardware needed depends on what the subscriber wants.   For someone already with suitable DSL who is happy with what their computer provides, nothing more is needed.   Those who want more and are willing to pay for it can simply go out and buy it, screen, speakers, or whatever.   Those who do not yet have DSL start by getting DSL and then adding whatever else they want.


Software For DSL TV

The software issue is even easier than the hardware issue.   There are, as far as I know, only two general kinds of personal computers; there are those built by Apple, Inc., and there are those that are not, and even those that are not use a very limited number of operating systems.   This would seem to suggest only a few programs, one for each type of machine/operating-system combination, and the subscriber would kick off an automated download of the appropriate program over the DSL.


Conclusion

TV, and other similar services, over ordinary phone lines are doable, and there are steps, some already in progress, that can be taken to maximize delivery.




Bibliography

[1]   Anonymous: Verizon.com

[2]   Koenen, Rob: "Object-based MPEG Offers Flexibility," EE Times, November 12, 2001.

[3]   Nyquist, Henry: "Certain Factors Affecting Telegraph Speed," Bell System Technical Journal, April 1924.

[4]   Prunty, Peter F.: "Delivery of TV Over Existing Phone Lines," SMPTE Journal, September 1994.

[5]   Shannon, Claude: "A Mathematical Theory of Communication," Bell System Technical Journal, July 1948; also October 1948.

[6]   Tanenbaum, Andrew S.: Computer Networks, Upper Saddle River, NJ: Prentice Hall, 2003.