Multimedia file stats in Perl?

on 07.11.2006 01:09:14 by Ilya Zakharevich

Each day the MP3::Tag module I maintain *looks* more and more like a
module to deal with arbitrary multimedia types... At least a lot of APIs
make sense for many other multimedia types; and the scripts supplied
with the module work mostly via these APIs.

But to actually address this similarity, I need (as a minimum) a way
to inspect a multimedia file, and get some stats about it.

The stats which come to my mind are

a) does it contain audio, video, stills, slideshows (or combinations
thereof)?

b) What is the duration, number of frames, framerate, size of (the
first?) image frame?

c) which module to use to get the embedded metadata?

d) what is the embedded metadata?

Are there ready-to-use Perl solutions to these problems?

Thanks,
Ilya

P.S. The other (meta)question is: which questions does it make sense
to ask about a given multimedia file if you do not care what the
"format" of the file is? This metaquestion should definitely
have been addressed by any multimedia architecture which
delegates the actual work to installable multimedia codecs.

Does anyone know about this?

Re: Multimedia file stats in Perl?

on 07.11.2006 17:58:06 by Ted Zlatanov

On 7 Nov 2006, nospam-abuse@ilyaz.org wrote:

> Each day the MP3::Tag module I maintain *looks* more and more like a
> module to deal with arbitrary multimedia types... At least a lot of APIs
> make sense for many other multimedia types; and the scripts supplied
> with the module work mostly via these API.
>
> But to actually address this similarity, I need (as a minimum) a way
> to inspect a multimedia file, and get some stats about it.
>
> The stats which come to my mind are
>
> a) does it contain audio, video, stills, slideshows (or combinations
> thereof)?
>
> b) What is the duration, number of frames, framerate, size of (the
> first?) image frame?
>
> c) which module to use to get the embedded metadata?
>
> d) what is the imbedded metadata?
>
> Are there ready-to-use Perl solutions to these problems?
>
> Thanks,
> Ilya
>
> P.S. The other (meta)questions is: which questions does it make sense
> to ask about a given multimedia file if you do not care what is
> the "format" of the file. This metaquestion should definitely
> have been addressed by any multimedia architecture which
> delegates the actual work to installable multimedia codecs.
>
> Anyone knowing about this?

I've seen this solved with mplayer, which can just report information
about the file. I think this is worth consideration, even though it's
not pure Perl, because:

1) users may install codecs for mplayer in the future, why not use
them?

2) it's not hard to install mplayer

So maybe the module could provide an mplayer-driven investigation into
the file, in addition to the pure Perl approach.
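
As an illustration of the mplayer-driven route, here is a minimal sketch. It assumes that mplayer is on the PATH and that its -identify option prints lines of the form ID_NAME=value; the function names parse_identify_output and mplayer_identify are made up for this example and are not part of MP3::Tag.

```perl
use strict;
use warnings;

# Collect the "ID_NAME=value" lines that "mplayer -identify" is assumed
# to print into a hash keyed by NAME (the ID_ prefix is dropped).
sub parse_identify_output {
    my ($text) = @_;
    my %info;
    for my $line (split /\r?\n/, $text) {
        $info{$1} = $2 if $line =~ /^ID_([A-Z0-9_]+)=(.*)$/;
    }
    return \%info;
}

# Run mplayer without decoding any frames and parse what it reports.
# Hypothetical helper: the option set is one commonly used combination.
sub mplayer_identify {
    my ($file) = @_;
    open my $pipe, '-|', 'mplayer', '-identify', '-frames', '0',
        '-vo', 'null', '-ao', 'null', $file
        or die "Cannot run mplayer: $!";
    local $/;                      # slurp the whole output at once
    my $out = <$pipe>;
    close $pipe;
    return parse_identify_output(defined $out ? $out : '');
}
```

Keeping the parser separate from the process call means the parsing can be checked on canned output even on a machine where mplayer is not installed.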

I don't know about the metadata design, sorry. Maybe ID3v2 already is
sufficient, or there's something similar for video? I think there are
so many video codecs that there's just no unifying tag format yet.

Ted

Re: Multimedia file stats in Perl?

on 08.11.2006 05:33:55 by Ilya Zakharevich

[A complimentary Cc of this posting was sent to Ted Zlatanov], who wrote:
> I've seen this solved with mplayer, which can just report information
> about the file. I think this is worth consideration, even though it's
> not pure Perl, because:
>
> 1) users may install codecs for mplayer in the future, why not use
> them?

Since most users won't install mplayer, why bother?

> 2) it's not hard to install mplayer

Correct. However (in my experience), after installation there is
little chance that it will work correctly...

[THIS THREAD is in part due to my bad experience with mplayer. ;-) :-[]

> So maybe the module could provide an mplayer-driven investigation into
> the file, in addition to the pure Perl approach.

"In addition" is very fine with me. But I would prefer to have "pure
Perl approach" first. ;-)

Unfortunately, I have no idea where to start... I would prefer to
reuse existing modules rather than write everything myself...

> I don't know about the metadata design, sorry. Maybe ID3v2 already is
> sufficient, or there's something similar for video?

There must be something like this for Matroska; additionally, Ogg
has some (brain-dead) metadata storage...

> I think there are so many video codecs that there's just no unifying
> tag format yet.

My current approach is to make MP3::Tag look into the file
name_of_mm.id3 if it is asked to find metadata in name_of_mm.foo and
cannot find any there. This will "solve" the problem for formats which
do not have metadata slots. However, it would be nice to have formats
WITH metadata slots accessible too - preferably via a type-transparent
interface.
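
The sidecar lookup described above could be sketched roughly as follows. This is only an illustration of the idea, not how MP3::Tag implements it; sidecar_name, metadata_source and the $read_embedded callback are hypothetical names.

```perl
use strict;
use warnings;

# Map "name_of_mm.foo" to "name_of_mm.id3"; only the last extension is
# replaced, and a dot inside a directory name is left alone.
sub sidecar_name {
    my ($file) = @_;
    (my $base = $file) =~ s/\.[^.\/\\]*\z//;
    return "$base.id3";
}

# Ask the caller-supplied reader (a code reference, hypothetical) for
# embedded metadata first; fall back to the sidecar file if it exists.
sub metadata_source {
    my ($file, $read_embedded) = @_;
    my $meta = $read_embedded->($file);
    return ('embedded', $meta) if defined $meta;
    my $side = sidecar_name($file);
    return ('sidecar', $side) if -e $side;
    return ('none', undef);
}
```

A caller would use it as my ($where, $what) = metadata_source($file, \&reader); and then branch on $where.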

Thanks,
Ilya

Re: Multimedia file stats in Perl?

on 08.11.2006 16:43:05 by rvtol+news

Ilya Zakharevich wrote:

> which questions does it make sense
> to ask about a given multimedia file if you do not care what is
> the "format" of the file. This metaquestion should definitely
> have been addressed by any multimedia architecture which
> delegates the actual work to installable multimedia codecs.


Maybe attributes like Title and Author and such.

--
Affijn, Ruud

"Gewoon is een tijger."

Re: Multimedia file stats in Perl?

on 08.11.2006 17:11:12 by John Bokma

Ilya Zakharevich wrote:

> Each day the MP3::Tag module I maintain *looks* more and more like a
> module to deal with arbitrary multimedia types... At least a lot of
> APIs make sense for many other multimedia types; and the scripts
> supplied with the module work mostly via these API.
>
> But to actually address this similarity, I need (as a minimum) a way
> to inspect a multimedia file, and get some stats about it.
>
> The stats which come to my mind are
>
> a) does it contain audio, video, stills, slideshows (or combinations
> thereof)?
>
> b) What is the duration, number of frames, framerate, size of (the
> first?) image frame?
>
> c) which module to use to get the embedded metadata?
>
> d) what is the imbedded metadata?
>
> Are there ready-to-use Perl solutions to these problems?
>
> Thanks,
> Ilya
>
> P.S. The other (meta)questions is: which questions does it make sense
> to ask about a given multimedia file if you do not care what is
> the "format" of the file. This metaquestion should definitely
> have been addressed by any multimedia architecture which
> delegates the actual work to installable multimedia codecs.
>
> Anyone knowing about this?

For stills, look into EXIF. Things I want to know are: which device was
used, its settings, copyright information, title, author, date, time, etc.

Probably you want to normalize field names, and create a wrapper that is
able to offer metadata both in raw format and normalized.

$meta_info->get_field_names();
$meta_info->get_field_names_raw();
$meta_info->get_value_for( 'title' );
$meta_info->get_value_for_raw( 'title' );

Something like the above :-)
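
A minimal sketch of such a wrapper, assuming the raw fields and the raw-to-normalized name map are handed in by a format-specific reader. The package name MetaInfo and its constructor arguments are invented for this example; TIT2 and TPE1 are the ID3v2 frame IDs for title and lead artist.

```perl
use strict;
use warnings;

{
    package MetaInfo;    # hypothetical wrapper class

    # raw => { raw_field => value }, map => { raw_field => normalized_name }
    sub new {
        my ($class, %args) = @_;
        return bless {
            raw => { %{ $args{raw} || {} } },
            map => { %{ $args{map} || {} } },
        }, $class;
    }

    sub get_field_names_raw {
        my ($self) = @_;
        my @names = sort keys %{ $self->{raw} };
        return @names;
    }

    # Only raw fields that have a normalized name are listed here.
    sub get_field_names {
        my ($self) = @_;
        my @names = grep { defined } map { $self->{map}{$_} } keys %{ $self->{raw} };
        @names = sort @names;
        return @names;
    }

    sub get_value_for_raw {
        my ($self, $name) = @_;
        return $self->{raw}{$name};
    }

    sub get_value_for {
        my ($self, $name) = @_;
        for my $raw (sort keys %{ $self->{raw} }) {
            my $norm = $self->{map}{$raw};
            return $self->{raw}{$raw} if defined $norm and $norm eq $name;
        }
        return;
    }
}

my $meta_info = MetaInfo->new(
    raw => { TIT2 => 'Some Song', TPE1 => 'Some Band' },
    map => { TIT2 => 'title',     TPE1 => 'artist' },
);
print $meta_info->get_value_for('title'), "\n";
```

Passing the name map in from outside keeps the wrapper itself format-neutral; each reader (ID3v2, EXIF, and so on) would supply its own map.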

--
John
Experienced Perl programmer: http://castleamber.com/
Perl help, tutorials, and examples: http://johnbokma.com/perl/

Re: Multimedia file stats in Perl?

on 10.11.2006 18:01:51 by Ted Zlatanov

On 8 Nov 2006, nospam-abuse@ilyaz.org wrote:

>> I've seen this solved with mplayer, which can just report information
>> about the file. I think this is worth consideration, even though it's
>> not pure Perl, because:
>>
>> 1) users may install codecs for mplayer in the future, why not use
>> them?
>
> Since most users won't install mplayer, why bother?

They will probably install some movie player. Maybe we should use
Xine, VLC, etc. if they are available (and those players themselves
may use portable libraries that do what you want).

>> 2) it's not hard to install mplayer
>
> Correct. However (in my experience), after installation, there is
> little chance that it will work correct...
>
> [In part, THIS THREAD is partially due to my bad experience with
> mplayer. ;-) :-[]

OK. I don't share your experience but sure, not everyone is happy
with any given package.

>> So maybe the module could provide an mplayer-driven investigation into
>> the file, in addition to the pure Perl approach.
>
> "In addition" is very fine with me. But I would prefer to have "pure
> Perl approach" first. ;-)
>
> Unfortunately, I have no idea where to start... I would prefer to
> reuse existing modules than to write everything myself...

I think something portable must already exist. See above about
existing players.

>> I don't know about the metadata design, sorry. Maybe ID3v2 already is
>> sufficient, or there's something similar for video?
>
> There must be something like this for Matroska; additionally, OGG
> has some (brain-dead) metadata storage...

I don't know of any Matroska metadata. It's not a lovely format in any
case, but very popular nevertheless.

>> I think there are so many video codecs that there's just no unifying
>> tag format yet.
>
> My current approach is make MP3::Tag to look into file name_of_mm.id3
> if it is asked to find metadata in name_of_mm.foo, and it could not
> find it there. This will "solve" the problem with formats which do
> not have metadata slots. However, it would be nice to have formats
> WITH metadata slots accessible too - preferably via type-transparent
> interface.

OK, so does anyone know for sure about:

- Matroska
- WMV
- VOB

metadata? I am not an expert on any of them.

Ted

Re: Multimedia file stats in Perl?

on 11.11.2006 05:59:03 by Ilya Zakharevich

[A complimentary Cc of this posting was sent to Ted Zlatanov], who wrote:
> >> 1) users may install codecs for mplayer in the future, why not use
> >> them?
> >
> > Since most users won't install mplayer, why bother?
>
> They will probably install some movie player.

Why would they? Most computers have a player already installed for
them, and AFAI suspect, most do not allow querying from outside...

Thanks,
Ilya

Re: Multimedia file stats in Perl?

on 11.11.2006 10:49:48 by rvtol+news

Ilya Zakharevich wrote:
> Ted Zlatanov:

>>>> 1) users may install codecs for mplayer in the future, why not use
>>>> them?
>>>
>>> Since most users won't install mplayer, why bother?
>>
>> They will probably install some movie player.
>
> Why would they? Most computers have a player already installed for
> them, and AFAI suspect, most do not allow querying from outside...

What kind of querying, at what moment?

flvlength: http://www.kaourantin.net/source/flvlength.pl

--
Affijn, Ruud

"Gewoon is een tijger."

Re: Multimedia file stats in Perl?

on 12.11.2006 09:54:46 by Ilya Zakharevich

[A complimentary Cc of this posting was NOT [per weedlist] sent to Dr.Ruud], who wrote:
> What kind of querying,

THAT'S the big question in my initial post...

> at what moment?

??? You mean "now"?

Now I know what makes sense for MP3, and what I want from the
multimedia files I CURRENTLY work with:

x) (envelope) format ("MPEG4 inside AVI");
x) subtype of (envelope) format (layer III [of MPEG1] in ODML'ed AVI);
x) Has audio/stills/video/time-delayed-actions (slideshow?)?
x) "Attributes" of each component (compressed audio? interlaced video? etc)
x) Time duration (of each "component"?);
x) Number of audio/video/still "frames"/pages;
x) Framerate (of each "component");
x) Byterate (of each recognized "component", and total);
x) "Total filesize" of each component;
x) Pixel size of (the first? largest?) image/video frame;
x) PPI value for (the first?) image/video frame;
x) Number of layers in a (video) frame? (# of Languages in audio?)
x) Number of channels in each component (gray+alpha? RGB? Stereo? 7+1?)
x) Bitdepth of a sample in each component/layer/channel;
x) Gamma/color-profile/color-space?
x) Has a "preview"/menu embedded?

This is at least what I remember now...
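
One possible way to hold such answers is a per-file record with one entry per component. The sketch below is purely illustrative: every key name and every value is made up, and it covers only a few of the items in the list above.

```perl
use strict;
use warnings;

# Hypothetical record for one file; all names and numbers are invented.
my %stats = (
    container  => 'AVI',
    subtype    => 'OpenDML',
    components => [
        { kind => 'video', codec => 'MPEG4', duration => 10,
          frames => 250, framerate => 25, width => 320, height => 240 },
        { kind => 'audio', codec => 'MPEG1 layer III', duration => 10,
          channels => 2 },
    ],
);

# Which kinds of content (audio, video, stills, ...) does the file contain?
sub kinds {
    my ($stats) = @_;
    my %seen;
    $seen{ $_->{kind} }++ for @{ $stats->{components} };
    my @kinds = sort keys %seen;
    return @kinds;
}

print join('+', kinds(\%stats)), "\n";    # prints "audio+video"
```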

> flvlength: http://www.kaourantin.net/source/flvlength.pl

Sorry, it is not clear what it does. Does it just count video frames
by reading frames one-by-one, or what?

Thanks,
Ilya

Re: Multimedia file stats in Perl?

on 12.11.2006 13:48:36 by rvtol+news

Ilya Zakharevich wrote:
> Dr.Ruud:

>> What kind of querying,
>
> THAT'S the big question in my initial post...
>
>> at what moment?
>
> ??? You mean "now"?

I mean: on physical files only, or also on active streaming devices, etc.?
Also statistics, for example aggregations, on a set of multimedia files,
like a full CD or DVD.
Maybe even calculated estimates like rhythm and dynamics.
So just "everything" that is feasible for that multimedia type.


> Now I know what makes sense for MP3, and what I want from the
> multimedia files I CURRENTLY work with:
>
> x) (envelop) format ("MPEG4 inside AVI");
> x) subtype of (envelop) format (layer III [of MPEG1] in ODML'ed
> AVI);
> x) Has audio/stills/video/time-delayed-actions (slideshow?)?
> x) "Attributes" of each component (compressed audio? interlaced
> video? etc)
> x) Time duration (of each "component"?);
> x) Number of audio/video/still "frames"/pages;
> x) Framerate (of each "component");
> x) Byterate (of each recognized "component", and total);
> x) "Total filesize" of each component;
> x) Pixel size of (the first? largest?) image/video frame;
> x) PPI value for (the first?) image/video frame;
> x) Number of layers in a (video) frame? (# of Languages in audio?)
> x) Number of channels in each component (gray+alpha? RGB? Stereo?
> 7+1?)
> x) Bitdepth of a sample in each component/layer/channel;
> x) Gamma/color-profile/color-space?
> x) Has a "preview"/menu embedded?
>
> This is at least what I remember now...

Yes, all basic information and details that are available.
Calculated aggregations belong in a different information layer: the
framerate from a tag doesn't have to be the same as the derived framerate.
A polling mechanism with callbacks to deliver live statistics.
A plugin per multimedia type. But that is all obvious, isn't it?

I prefer an aggregator per multi-media type, that stores "everything" it
can find and calculate in a complex intralinked datastructure (somewhat
exportable as XML), and a thin interface on top of it for simple queries
like title(), track(), etc. That aggregator should also have a
time-limited mode (do as much as you can in 2 seconds), and a lazy mode
(uses more memory, postpones calculations and CDDB-access etc.).
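
The lazy mode could look roughly like the sketch below, where each statistic is registered as a code reference and computed only on first request. LazyStats and its methods are hypothetical names, and the time-limited mode is not covered.

```perl
use strict;
use warnings;

{
    package LazyStats;    # hypothetical aggregator with lazy evaluation

    # Takes name => code reference pairs; nothing is computed yet.
    sub new {
        my ($class, %compute) = @_;
        return bless { compute => {%compute}, cache => {} }, $class;
    }

    # Compute a value on first access and cache it for later calls.
    sub get {
        my ($self, $key) = @_;
        return $self->{cache}{$key} if exists $self->{cache}{$key};
        my $code = $self->{compute}{$key} or return;
        return $self->{cache}{$key} = $code->();
    }
}

my $calls = 0;
my $stats = LazyStats->new(title => sub { $calls++; return 'Some Title' });
$stats->get('title') for 1 .. 3;    # the code reference runs only once
```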

Is this what you are aiming for?

--
Affijn, Ruud

"Gewoon is een tijger."

Re: Multimedia file stats in Perl?

on 12.11.2006 23:02:46 by Ilya Zakharevich

[A complimentary Cc of this posting was NOT [per weedlist] sent to Dr.Ruud], who wrote:
> > x) (envelop) format ("MPEG4 inside AVI");
> > x) subtype of (envelop) format (layer III [of MPEG1] in ODML'ed
> > AVI);
> > x) Has audio/stills/video/time-delayed-actions (slideshow?)?
> > x) "Attributes" of each component (compressed audio? interlaced
> > video? etc)
> > x) Time duration (of each "component"?);
> > x) Number of audio/video/still "frames"/pages;
> > x) Framerate (of each "component");
> > x) Byterate (of each recognized "component", and total);
> > x) "Total filesize" of each component;
> > x) Pixel size of (the first? largest?) image/video frame;
> > x) PPI value for (the first?) image/video frame;
> > x) Number of layers in a (video) frame? (# of Languages in audio?)
> > x) Number of channels in each component (gray+alpha? RGB? Stereo?
> > 7+1?)
> > x) Bitdepth of a sample in each component/layer/channel;
> > x) Gamma/color-profile/color-space?
> > x) Has a "preview"/menu embedded?

> Yes, all basic information and details that are available.
> Calculated aggregations in a different information-layer, the framerate
> from a tag doesn't have to be the same as the derived framerate.

MP3::Tag has configurable options to handle such possible discrepancies...

> A polling mechanism with callbacks to deliver live statistics.

This is too much for my current purposes. I just need static "header-like"
info available *without accessing the actual contents*.

Contents analysis is also important (compare with Audio::FindChunks),
but is, IMO, a somewhat orthogonal topic.

> I prefer an aggregator per multi-media type, that stores "everything" it
> can find and calculate in a complex intralinked datastructure (somewhat
> exportable as XML), and a thin interface on top of it for simple queries
> like title(), track(), etc. That aggregator should also have a
> time-limited mode (do as much as you can in 2 seconds), and a lazy mode
> (uses more memory, postpones calculations and CDDB-access etc.).
>
> Is this what you are aiming for?

I do not think so. More like a header analyser than a data
analyser... Of course, with some brain-damaged header formats, one
may need to actually count frames to find their number; but at least
one does not need to decode frames.
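
To make "header analyser" concrete, here is a rough sketch that reads only the main AVI header (the 'avih' chunk) from the first few kilobytes of a file. It assumes the standard little-endian layout of that chunk; the function names are made up, and for OpenDML files the frame count stored there may not cover the whole file.

```perl
use strict;
use warnings;

# Parse the 'avih' chunk out of a buffer holding the start of an AVI file.
# Returns a hash reference, or nothing if the buffer does not look like AVI.
sub parse_avi_header {
    my ($buf) = @_;
    return unless defined $buf and length($buf) >= 12;
    my ($riff, undef, $form) = unpack 'a4 V a4', $buf;
    return unless $riff eq 'RIFF' and $form eq 'AVI ';
    my $pos = index $buf, 'avih';
    return if $pos < 0 or length($buf) < $pos + 8 + 40;
    # Ten little-endian 32-bit fields follow the 8-byte chunk header.
    my ($usec_per_frame, undef, undef, undef, $total_frames,
        undef, $streams, undef, $width, $height)
        = unpack 'V10', substr($buf, $pos + 8, 40);
    my %info = (frames => $total_frames, streams => $streams,
                width  => $width,        height  => $height);
    if ($usec_per_frame) {
        $info{framerate} = 1e6 / $usec_per_frame;
        $info{duration}  = $total_frames * $usec_per_frame / 1e6;
    }
    return \%info;
}

# Read just the beginning of the file; no frame is touched or decoded.
sub avi_file_info {
    my ($file) = @_;
    open my $fh, '<:raw', $file or die "Cannot open $file: $!";
    read $fh, my $buf, 4096;
    close $fh;
    return parse_avi_header($buf);
}
```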

Hope this helps,
Ilya