
Newbie: Probability density function ???

Hi everyone, this is my first post on this forum and I would like to learn some basics of stochastic processes. One of them is the **probability density function (PDF)**.

1. What is a PDF, and what information can be extracted from a given PDF?

2. It would be nice if an example could be illustrated with a daily weather forecast or daily electricity usage in a certain area.

**Reason:** I currently have one year of electricity market data and am thinking about possible ways to arrange it. One arrangement I tried was to collect the data for every Sunday of every week, i.e. 53 Sundays per year, as shown in the picture. What other arrangements are possible?

Attachment 27363

3. From the picture above, how can I define a PDF (what components do I need)?

Re: Newbie: Probability density function ???

Hey luckyali.

For 1) A PDF represents either the probability of getting a particular event (if the distribution is discrete, in which case it is strictly called a probability mass function) or a function that is integrated to obtain the probability of a non-zero length interval (in the case of a continuous distribution).

The PDF can be interpreted in different ways, but you should think of it as showing the long-term behavior of the stochastic process if you carried out the process an infinite number of times.

Hint: For your other problem you can get a PDF by taking a set of frequency data and dividing all cells by the total frequency count (i.e. the sum of all frequencies).


Re: Newbie: Probability density function ???

Actually, I was given annual hourly electricity usage data (365x24). I took the mean over the 24 hours of each day, so I now have a (365x1) vector. If I draw the histogram I get this:

Attachment 27612

I understand that the y-axis represents the frequency of the x-values. But how can I extract probabilities from this histogram? In other words, how can I draw a PDF from the histogram?

The command [F,X]=hist(data) gives me F = 4 11 14 14 17 8 5 22 30 42 38 26 23 13 11 10 24 20 23 10 for the given X (sum(F) = 365 days, correct!).

Am I correct to use P = F/sum(F) to get the probability of occurrence of each bar? If yes, how do I display these probabilities on the graph?
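
The normalisation asked about here can be checked directly with the counts quoted in the post. Below is a minimal sketch in plain Python (used instead of MATLAB only so that it runs without any toolbox). The bin centres X were not posted, so the bars are labelled by bin index, and the text-bar display is just one illustrative way to show the values.

```python
# Bin counts quoted in the post (20 bins, 365 days in total).
F = [4, 11, 14, 14, 17, 8, 5, 22, 30, 42,
     38, 26, 23, 13, 11, 10, 24, 20, 23, 10]

total = sum(F)                      # total number of days
P = [f / total for f in F]          # relative frequency of each bin

# Print each bin's probability next to a crude text bar.
for i, p in enumerate(P, start=1):
    print(f"bin {i:2d}: P = {p:.4f} " + "#" * round(p * 200))

print("sum of P =", round(sum(P), 10))
```

Each value in P is the probability that a randomly chosen day falls in that bin. These are probabilities per bin, not yet a density; a density would also divide by the bin width.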

Re: Newbie: Probability density function ???

The bar graph gives usage per day over a 365 day period.

If I divide each y entry of the bar graph by the total usage I get:

y = pdf = usage at time t per day/total usage.

If I draw a smooth curve through this I can integrate from 0 to t to get fraction of yearly usage after t days, or integrate from t1 to t2 to get fraction of yearly usage over the time period t1 to t2, or just sum area of bar graph from 0 to t or from t1 to t2.

Probability Background:

In general, if you have a bar graph with y showing the number of occurrences for each of N intervals, and then divide each y entry by the total number of occurrences (and by the interval width, when the intervals are not of unit length), you have a probability density distribution. If the intervals are time, then the PDF, y=p(t), has units of probability of occurrence/time. Then the probability of the event occurring between t1 and t2 is ∫_{t1}^{t2} p(t)dt (assuming you draw a curve through the bar graph, otherwise it’s a finite sum (area)).

The area under the original bar graph (occurrence distribution) from t1 to t2 gives the total number of occurrences from t1 to t2. The area under the pdf from t1 to t2 gives the probability of occurrence from t1 to t2.

For example, the occurrence might be light bulb failures in a subway system plotted as a function of time from installation of all new bulbs till the last bulb fails, with a bar graph showing number of failures in hourly intervals. If I divide each y entry by the total number of light bulbs, I get:

y = pdf = Number of light bulbs failed at time t per hour/ total number of light bulbs

y = fraction of light bulbs failed at time t per hour.

If I draw a continuous curve through the bar graph, I can integrate y from 0 to t to find the fraction of bulbs failed after t hours (x100 for percentage).

EDIT: I believe the NYC subway system decides on a certain tolerable percentage of light bulb failures and finds the time at which this occurs from the pdf, reaches a compromise solution (max time and max tolerable failures), and then changes ALL light bulbs at that time because it's cheaper than replacing a bulb every time one fails.
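
The light bulb reasoning can be sketched numerically. Everything in the Python snippet below is made up for illustration: the failure counts, the 100-hour interval width and the 20% tolerance are assumptions, not real subway data. It builds the pdf from the counts and accumulates the area until the tolerated fraction is exceeded.

```python
# Hypothetical failure counts per interval (NOT real subway data).
failures = [2, 5, 10, 20, 30, 20, 10, 3]    # bulbs failed in each interval
width_hours = 100                            # assumed interval width
tolerance = 0.20                             # assumed tolerable failed fraction

total_bulbs = sum(failures)                  # every bulb eventually fails
# pdf: fraction of bulbs failing per hour within each interval
pdf = [f / (total_bulbs * width_hours) for f in failures]

# Accumulate the area under the pdf until the tolerance is exceeded.
cumulative = 0.0
crossed_by = None
for i, density in enumerate(pdf):
    cumulative += density * width_hours
    if cumulative > tolerance:
        crossed_by = (i + 1) * width_hours   # end of the interval that crosses it
        break

print(f"fraction failed by {crossed_by} h: {cumulative:.2f}")
```

With finer intervals, or a smooth curve through the bars, the crossing time could be located more precisely than the end of a whole interval.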

Re: Newbie: Probability density function ???

Ex: Usage

| Time | Usage | pdf = usage per month / total usage |
|---|---|---|
| 0-3 months | 300 | 300/(3x2100) |
| 3-6 months | 600 | 600/(3x2100) |
| 6-9 months | 900 | 900/(3x2100) |
| 9-12 months | 300 | 300/(3x2100) |

Fraction of yearly usage in the first month: 1x300/(3x2100)

Fraction of yearly usage from month 4 to month 7: 2x600/(3x2100) + 1x900/(3x2100)

Multiply by 100 for a percentage.
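
The arithmetic in this example can be checked with a short Python sketch. The numbers come from the example itself; the helper names `density` and `fraction` are my own and purely illustrative.

```python
# (start_month, end_month, usage) rows from the usage example.
rows = [(0, 3, 300), (3, 6, 600), (6, 9, 900), (9, 12, 300)]
total_usage = sum(u for _, _, u in rows)     # 2100

def density(start, end, usage):
    """Fraction of yearly usage per month within one interval."""
    return usage / ((end - start) * total_usage)

def fraction(t1, t2):
    """Fraction of yearly usage between months t1 and t2 (area under the pdf)."""
    area = 0.0
    for start, end, usage in rows:
        # Length of the part of [t1, t2] that lies inside this interval.
        overlap = max(0, min(t2, end) - max(t1, start))
        area += overlap * density(start, end, usage)
    return area

print(fraction(0, 1))    # first month
print(fraction(4, 7))    # month 4 to month 7
print(fraction(0, 12))   # whole year
```

The first value is 1/21 of the yearly usage, the second is 1/3, and the whole year sums to 1, as a pdf must.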


Re: Newbie: Probability density function ???

Quote, originally posted by **Hartlw**:

The bar graph gives usage per day over a 365 day period.

If I divide each y entry of the bar graph by the total usage I get:

y = pdf = usage at time t per day/total usage.

For this I used the following MATLAB commands and got the figure below.

[F, X] = hist(daily_mean, 20); % F = count in each bin, X = bin centers

bar(X,F/sum(F))

Attachment 27723

Do you think the y-axis shows the correct probabilities of X? If not, please correct me.
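
One way to check a plot like this is to compare the per-bin probabilities with a density whose area is 1. The Python sketch below does that on made-up data: `daily_mean` here is a hypothetical stand-in generated at random, not the poster's data, and the binning (20 equal-width bins between the minimum and maximum) is my assumption about what hist(daily_mean,20) does.

```python
import random

random.seed(0)
# Hypothetical stand-in for daily_mean: 365 made-up daily averages.
daily_mean = [random.gauss(50.0, 10.0) for _ in range(365)]

nbins = 20
lo, hi = min(daily_mean), max(daily_mean)
width = (hi - lo) / nbins

# Count values per equal-width bin (the maximum goes into the last bin).
F = [0] * nbins
for x in daily_mean:
    k = min(int((x - lo) / width), nbins - 1)
    F[k] += 1

P = [f / sum(F) for f in F]          # probability of each bin
density = [p / width for p in P]     # probability per unit of x

print("sum of probabilities:", sum(P))
print("area under density:  ", sum(d * width for d in density))
```

Heights of F/sum(F) are the probabilities of landing in each bin and sum to 1. Dividing them by the bin width gives a density, whose bar areas (rather than heights) sum to 1.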

Re: Newbie: Probability density function ???

Sorry, I don't understand MatLab or the graph.

I note you have twenty bars. Does this mean you have divided the year into twenty parts? In that case each bar width is 18.25 days and each bar height is the usage over that 18.25-day period divided by the yearly usage, with units (usage/total usage)/18.25 days.

You should label the units of the graph.

Re: Newbie: Probability density function ???

Test your program with a simple case:

| Time | Usage | pdf = usage per month / total usage |
|---|---|---|
| 0-3 months | 300 | 300/(3x2100) |
| 3-6 months | 600 | 600/(3x2100) |
| 6-9 months | 900 | 900/(3x2100) |
| 9-12 months | 300 | 300/(3x2100) |

Then post the result and I'll tell you if it's right.

Hint: you should have four bars with heights shown.