Reduce dataset but still show anomalies

on 20.08.2010 17:12:03 by bcantwell

I am trying to produce charts for large amounts of data. I already limit
the user to a smaller time frame in order to reduce the possible data
points, but still can end up with far more data points than are clearly
plottable on a chart. Does anyone have an idea of how I can drop
insignificant points, or average the data or do something to end up with
no more than about 3k points and still show spikes and dips in the
charts so my users can still clearly identify anomalies in their charts?
I don't want to smooth out the spikes and dips if at all possible.
I considered running through the dataset and doing a compare of point 2
to point 1 and if it is close in value throw it away, otherwise keep it.
That probably would not work on a 'noisy' chart however...
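
A minimal sketch of the neighbour-comparison idea described above, in Python. The function name `thin_by_delta` and the `tolerance` parameter are hypothetical, and the sketch compares each point with the last point that was kept rather than with its immediate predecessor, which is an assumption made so that slow drifts are not dropped entirely. The second sample shows the concern about noisy charts: nothing gets thrown away.

```python
def thin_by_delta(points, tolerance):
    """Keep a point only if its y value differs from the last kept
    y value by more than `tolerance` (a hypothetical threshold)."""
    if not points:
        return []
    kept = [points[0]]  # always keep the first point
    for x, y in points[1:]:
        if abs(y - kept[-1][1]) > tolerance:
            kept.append((x, y))
    return kept


# Illustrative sample data, not taken from a real chart.
flat = [(1, 3), (2, 3), (3, 15), (4, 3), (5, 3)]
noisy = [(1, 3), (2, 6), (3, 2), (4, 7), (5, 3)]

print(thin_by_delta(flat, 2))   # [(1, 3), (3, 15), (4, 3)]
print(thin_by_delta(noisy, 2))  # all five points survive
```

On the flat series the spike is kept and the repeated values are dropped; on the noisy series every step exceeds the tolerance, so the dataset is not reduced at all.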


Thanks,
Bryancan


Re: Reduce dataset but still show anomalies

on 20.08.2010 17:16:02 by Jangita

Have you tried, instead of showing values per minute, showing the average
per hour or per day? This will generally smooth the points out a little. In
my case, if I show registrations per day I get dips every Saturday and
Sunday, so it looks all jagged, but per week doesn't show the
Saturday/Sunday dips...

--
Jangita | +256 76 91 8383 | Y! & MSN: jangita@yahoo.com
Skype: jangita | GTalk: jangita.nyagudi@gmail.com

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=gcdmg-mysql-2@m.gmane.org

Re: Reduce dataset but still show anomalies

on 20.08.2010 17:24:26 by bcantwell

Yes, but I DON'T want the spikes smoothed out.


Re: Reduce dataset but still show anomalies

on 20.08.2010 17:30:49 by Philip Riebold

On 20 Aug 2010, at 16:24, Bryan Cantwell wrote:

> Yes, but I DON'T want the spikes smoothed out

Display the max and min of each successive set of 10 (or 100 or 1000)
elements from the data?
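
The min/max suggestion above could be sketched in Python roughly as follows. The function name `min_max_decimate` and the `bucket_size` parameter are hypothetical, the points are assumed to be `(x, y)` tuples already sorted by x, and the sample values are illustrative only.

```python
def min_max_decimate(points, bucket_size):
    """For each run of `bucket_size` consecutive points, keep only the
    point with the lowest y and the point with the highest y."""
    out = []
    for start in range(0, len(points), bucket_size):
        bucket = points[start:start + bucket_size]
        lo = min(bucket, key=lambda p: p[1])
        hi = max(bucket, key=lambda p: p[1])
        # A set drops the duplicate when lo and hi are the same point;
        # sorting restores x order so the plotted line does not double back.
        out.extend(sorted({lo, hi}))
    return out


points = [(1, 3), (2, 7), (3, 15), (4, 4), (5, 3), (6, 3)]
print(min_max_decimate(points, 3))  # [(1, 3), (3, 15), (4, 4), (5, 3)]
```

Each bucket yields at most two points, so for a target of about 3k points a bucket size of roughly `len(points) / 1500` would be a reasonable starting value. Every spike and dip survives because the extremes of each bucket are never averaged away.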

--
TTFN.

Philip Riebold, p.riebold@ucl.ac.uk


RE: Reduce dataset but still show anomalies

on 20.08.2010 17:32:19 by Steven Staples

I am not too good with charting (even though I would like to be), but
what about getting the max, min and avg, and if the max or min differs
from the avg by more than x%, showing that instead...?
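
One possible reading of this idea, sketched in Python. Everything here is an assumption made for illustration: the name `avg_or_extremes`, the `pct` parameter, and the interpretation that a bucket is plotted as its average unless its min or max lies more than `pct` percent of the average away from that average, in which case the extreme point is plotted instead.

```python
def avg_or_extremes(points, bucket_size, pct):
    """Plot one averaged point per bucket, unless the bucket's min or max
    deviates from the average by more than `pct` percent of the average."""
    out = []
    for start in range(0, len(points), bucket_size):
        bucket = points[start:start + bucket_size]
        avg = sum(y for _, y in bucket) / len(bucket)
        limit = abs(avg) * pct / 100.0
        lo = min(bucket, key=lambda p: p[1])
        hi = max(bucket, key=lambda p: p[1])
        extremes = sorted({p for p in (lo, hi) if abs(p[1] - avg) > limit})
        if extremes:
            out.extend(extremes)  # keep the real spike or dip
        else:
            # No anomaly: one averaged point at the middle of the bucket.
            mid_x = (bucket[0][0] + bucket[-1][0]) / 2
            out.append((mid_x, avg))
    return out


points = [(1, 3), (2, 7), (3, 15), (4, 4), (5, 3), (6, 3)]
print(avg_or_extremes(points, 3, 70))
```

With these illustrative values the first bucket is replaced by its spike `(3, 15)` and the second, quiet bucket collapses to a single averaged point at x = 5.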

Just throwing out ideas... probably not useful... but may lead to a better
idea ;)


Steven Staples



Re: Reduce dataset but still show anomalies

on 20.08.2010 17:34:08 by Jangita

On 20/08/2010 5:24 p, Bryan Cantwell wrote:
> Yes, but I DON'T want the spikes smoothed out
Hmm, how do you reduce the number of points and average them out without
smoothing out the spikes? By its nature, the more you average, the
smoother the spikes become, right? E.g.:

x,y
1,3
2,7
3,15 (spike)
4,4
5,3
6,3

averaged
range(x),y
1-2(1.5), 5 ((3+7)/2)
3-4(3.5), 9.5 ((15+4)/2)
5-6(5.5), 3 ((3+3)/2)

We have 3 points instead of 6, but the spike in the y value from 7 to 15
has also been smoothed: it now goes from 5 to 9.5.

OR, am I missing something? You may want to use another formula instead
of averaging.
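
The worked example above can be reproduced with a short Python sketch. The function name `average_buckets` is hypothetical; the data are the six points from the table.

```python
def average_buckets(points, bucket_size):
    """Replace each run of `bucket_size` points with a single point
    holding the mean x and the mean y of that run."""
    out = []
    for start in range(0, len(points), bucket_size):
        bucket = points[start:start + bucket_size]
        mean_x = sum(x for x, _ in bucket) / len(bucket)
        mean_y = sum(y for _, y in bucket) / len(bucket)
        out.append((mean_x, mean_y))
    return out


points = [(1, 3), (2, 7), (3, 15), (4, 4), (5, 3), (6, 3)]
averaged = average_buckets(points, 2)
print(averaged)                       # [(1.5, 5.0), (3.5, 9.5), (5.5, 3.0)]
print(max(y for _, y in points))      # 15
print(max(y for _, y in averaged))    # 9.5
```

The peak drops from 15 to 9.5, which is exactly the smoothing effect the original poster wants to avoid.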

RE: Reduce dataset but still show anomalies

on 20.08.2010 17:36:46 by Steven Staples

On another thought, what about if you group it by whatever, and if the
MIN() or MAX() is more than X times STDDEV() away from the average, show
the MIN() or MAX()?
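
A rough Python sketch of the standard-deviation variant. The interpretation is an assumption: a bucket's min or max counts as an anomaly when it lies more than `x_times` standard deviations from the bucket mean, and quiet buckets are plotted as their mean. The names `mean_or_outliers` and `x_times` are hypothetical. The population standard deviation (`statistics.pstdev`) is used because MySQL's STDDEV() is a population standard deviation.

```python
from statistics import mean, pstdev


def mean_or_outliers(points, bucket_size, x_times):
    """Plot each bucket as its mean, unless its min or max is more than
    `x_times` population standard deviations away from that mean."""
    out = []
    for start in range(0, len(points), bucket_size):
        bucket = points[start:start + bucket_size]
        ys = [y for _, y in bucket]
        avg = mean(ys)
        sd = pstdev(ys)
        lo = min(bucket, key=lambda p: p[1])
        hi = max(bucket, key=lambda p: p[1])
        outliers = sorted(
            {p for p in (lo, hi) if abs(p[1] - avg) > x_times * sd}
        )
        if outliers:
            out.extend(outliers)  # show the real extreme value
        else:
            mid_x = (bucket[0][0] + bucket[-1][0]) / 2
            out.append((mid_x, avg))
    return out


points = [(1, 3), (2, 7), (3, 15), (4, 4), (5, 3), (6, 3)]
print(mean_or_outliers(points, 6, 2))  # [(3, 15)]
```

Within a bucket of n values no value can be more than sqrt(n - 1) population standard deviations from the mean, so X has to be chosen with the bucket size in mind; with very small buckets a large X would never trigger.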

I just recalled a conversation with my boss the other week about
STDDEV().


Steven Staples


