I've posted this before numerous times, but here's a quick rundown of how the cognitive process works and how it relates to music:
The first thing to notice and acknowledge is the overlap between the language and music areas of the brain: music uses many of the same mechanisms we use when we hear people talk. In essence, our ability to draw emotion out of a piece has two facets.
1) We are wired to extrapolate emotional content out of arbitrary patterns, like we do in speech. For example, an angry person shouting in a language you don't understand (imagine this over the phone) can intimidate you or provoke various emotions. That's not because of the content of the speech but because of the way it's presented. In this fashion we can identify 3 basic emotions no matter what kind of music we hear: Happiness, Anger and Sadness.
Example A: A person who is depressed will often speak slowly and with a lower tone. These are some of the characteristics we often extrapolate in music as "sadness," regardless of what we're hearing.
Example B: Aggressiveness in music tends to communicate anger just like shouting at someone will. Again, the elements overlap here and it's easy to see the connection.
2) There is a mechanism by which we are chemically rewarded when we are surprised by a syntax change within an established context. This means that a diminished chord by Bach in a rather standard T-D-T cadence is meant to strike exactly this kind of balance between being harsh and still being predictable enough. This leads to a kind of "aesthetic curve" where this middle ground is at the center of a lot of changes in music. For example, an interrupted cadence in C major often lands on A minor instead of the expected tonic, because the two chords share most of their notes (C and E) even though A minor is an entirely different harmony. It's "far enough" that you notice, but not so far that it bothers you.
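To make the "far enough, but not too far" idea concrete, here's a rough sketch that counts the notes an A minor chord shares with the expected C major tonic. It assumes the usual pitch-class numbering (C = 0, twelve semitones per octave); the helper names `triad` and `shared_notes` are made up for illustration and don't come from any library.

```python
# Pitch classes in twelve-tone equal temperament, with C = 0 (an assumption
# made for this sketch).
NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def triad(root, quality):
    """Return the set of pitch classes of a major or minor triad."""
    intervals = {"major": (0, 4, 7), "minor": (0, 3, 7)}[quality]
    return {(root + step) % 12 for step in intervals}

def shared_notes(chord_a, chord_b):
    """Return the names of the notes two chords have in common, sorted by pitch class."""
    return [NAMES[pc] for pc in sorted(chord_a & chord_b)]

c_major = triad(0, "major")        # expected tonic: C E G
a_minor = triad(9, "minor")        # what the interrupted cadence gives you: A C E
f_sharp_major = triad(6, "major")  # a remote chord, for contrast: F# A# C#

print(shared_notes(c_major, a_minor))        # two of three notes in common
print(shared_notes(c_major, f_sharp_major))  # nothing in common
```

Counting common tones is obviously a crude stand-in for "distance" between chords, but it shows why A minor sits in that middle ground while something like F# major would not.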
The same surprise effect exists, of course, in speech as well. It happens when you read a sentence like, for example: "I'll install some Betty duck airplane." Spoken out loud, it will get a reaction from anyone along the same lines as what happens when there is an allowed break in harmony. This is also the basis of a lot of literary principles for forming sentences and so on.
Because this works on the basis of established syntax, it needs context to work, and this is where culture comes into play greatly. The break in syntax can only register if you're able to predict what -should- come, and what you get instead is something else. In a language or a music where you can't make that prediction, it's impossible to get this payoff. This is also, I suspect, the reason modern music is hard to get into, as it takes a while to assimilate many new elements until they begin working this way (hear enough atonality and you'll find "breaks" from it, just like in any kind of syntax within a certain context).
Example A: The Neapolitan cadence is a good old example of cadential harmony that is extremely powerful (you couldn't GET more dissonant than this back then), and it works precisely because what you get is similar to what you expect, but different enough that it makes you react.
Example B: Augmented chords took a long while to become stand-alone chords within a harmonic context. Liszt was one of the first to attempt to use them entirely on their own, without resorting to passing notes as a way to "legitimize" them. The reason is that the sound created by an augmented chord on its own is "too far" from the other sounds within a context, and therefore it's not a good stand-alone chord. Within a created context, however, it's used extensively, as the effect is diminished through the use of passing notes, e.g. T -> S by means of a progressively raised 5th that leads to the 3rd of the S. This is typical of Schubert, for example. It effectively creates an augmented chord, but only in passing and with pedal tones that ease off the dissonance. Compare with Mozart's rather pioneering minuet (Minuet in D, KV 355/576b), where the chord is used by itself (but resolved chromatically).
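As a rough illustration of the T -> S motion described in Example B, here's a small sketch in C major. It assumes pitch-class numbering with C = 0, and the function name `is_augmented` is made up for this example. It raises the 5th of the tonic by a semitone, checks that the result really is an augmented triad, and checks that the raised note sits a semitone below the 3rd of the subdominant.

```python
def is_augmented(pitch_classes):
    """True if three pitch classes are stacked in major thirds (4 semitones each)."""
    pcs = sorted(set(pitch_classes))
    if len(pcs) != 3:
        return False
    gaps = [(pcs[(i + 1) % 3] - pcs[i]) % 12 for i in range(3)]
    return gaps == [4, 4, 4]

# Pitch classes with C = 0 (an assumption for this sketch).
tonic = [0, 4, 7]                      # T: C E G
raised_fifth = (tonic[2] + 1) % 12     # G raised to G#
passing = [tonic[0], tonic[1], raised_fifth]  # C E G#, the passing chord

subdominant_root = 5                          # F
subdominant = [5, 9, 0]                       # S: F A C
third_of_s = (subdominant_root + 4) % 12      # A, the 3rd of the S

print(is_augmented(tonic))             # the plain tonic is not augmented
print(is_augmented(passing))           # the passing chord is
print((raised_fifth + 1) % 12 == third_of_s)  # G# moves up a semitone to A
```

The C stays put through all three chords, which is the kind of held tone that eases off the dissonance in the passing version.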
---
Further reading:
Towards a neural basis of music-evoked emotions (Trends Cog Sci, 2010)
Processing Expectancy Violations during Music Performance and Perception: An ERP Study (J Cog Neurosci, in press)
Universal Recognition of Three Basic Emotions in Music (Current Biology, 2009)
:>